[jira] [Updated] (YARN-9714) ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby
[ https://issues.apache.org/jira/browse/YARN-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9714: --- Attachment: YARN-9714.005.patch > ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby > - > > Key: YARN-9714 > URL: https://issues.apache.org/jira/browse/YARN-9714 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Reporter: Tao Yang > Assignee: Tao Yang > Priority: Major > Labels: memory-leak > Attachments: YARN-9714.001.patch, YARN-9714.002.patch, > YARN-9714.003.patch, YARN-9714.004.patch, YARN-9714.005.patch > > > Recently an RM full GC happened in one of our clusters. After investigating the > memory dump and jstack output, I found two places in the RM that may cause memory leaks after > the RM transitions to standby: > # The release-cache cleanup timer in AbstractYarnScheduler is never canceled. > # The ZooKeeper connection in ZKRMStateStore is never closed. > To fix these leaks, we should close the connection and cancel the timer when > the services are stopping. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
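The fix pattern the description calls for (cancel the timer and close the connection in the service's stop hook) can be sketched as follows. This is a minimal, self-contained illustration, not the actual ZKRMStateStore/AbstractYarnScheduler code; the class and field names are hypothetical stand-ins:

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative stand-in for a ZooKeeper client handle.
class FakeZkConnection implements AutoCloseable {
    private final AtomicBoolean open = new AtomicBoolean(true);
    public boolean isOpen() { return open.get(); }
    @Override public void close() { open.set(false); }
}

public class LeakFreeService {
    private Timer releaseCacheCleaner;
    private FakeZkConnection zkConnection;

    public void serviceStart() {
        zkConnection = new FakeZkConnection();
        releaseCacheCleaner = new Timer("release-cache-cleaner", true);
        releaseCacheCleaner.schedule(new TimerTask() {
            @Override public void run() { /* periodic cache cleanup */ }
        }, 1000L, 1000L);
    }

    // Without these two calls, the timer thread and the ZK connection
    // would survive a transition to standby and leak.
    public void serviceStop() {
        if (releaseCacheCleaner != null) {
            releaseCacheCleaner.cancel();
            releaseCacheCleaner = null;
        }
        if (zkConnection != null) {
            zkConnection.close();
            zkConnection = null;
        }
    }

    public FakeZkConnection connection() { return zkConnection; }
}
```

Each active/standby transition restarts the active services, so pairing every resource acquired in start with a release in stop is what keeps repeated transitions leak-free.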
[jira] [Commented] (YARN-9714) ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby
[ https://issues.apache.org/jira/browse/YARN-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917482#comment-16917482 ] Tao Yang commented on YARN-9714: {quote} Instead of comparing, how about checking for resourceManager.getZKManager() == null? That basically syncs the zkManager initialization code with the code that closes it. {quote} Makes sense to me. Attached the v5 patch for this, thanks!
[jira] [Commented] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
[ https://issues.apache.org/jira/browse/YARN-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917436#comment-16917436 ] Hadoop QA commented on YARN-9789: -

+1 overall

| Vote | Subsystem | Runtime | Comment |
| 0 | reexec | 1m 16s | Docker mode activated. |

Prechecks:
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |

trunk Compile Tests:
| +1 | mvninstall | 25m 34s | trunk passed |
| +1 | compile | 0m 41s | trunk passed |
| +1 | checkstyle | 0m 34s | trunk passed |
| +1 | mvnsite | 0m 46s | trunk passed |
| +1 | shadedclient | 13m 1s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 20s | trunk passed |
| +1 | javadoc | 0m 36s | trunk passed |

Patch Compile Tests:
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 44s | the patch passed |
| +1 | javac | 0m 44s | the patch passed |
| +1 | checkstyle | 0m 32s | the patch passed |
| +1 | mvnsite | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 8s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 35s | the patch passed |
| +1 | javadoc | 0m 32s | the patch passed |

Other Tests:
| +1 | unit | 87m 15s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 149m 7s | |

| Subsystem | Report/Notes |
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9789 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978721/YARN-9789-001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6253fb013fd6 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b1eee8b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24648/testReport/ |
| Max. process+thread count | 813 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24648/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated. > Disable Option for Write Ahead Logs o
[jira] [Commented] (YARN-9664) Improve response of scheduler/app activities for better understanding
[ https://issues.apache.org/jira/browse/YARN-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917421#comment-16917421 ] Weiwei Yang commented on YARN-9664: --- Just got time to take a look at this. Wow, this is a huge set of changes. [~Tao Yang], have you verified that the output of these changes is as expected? I will try to go through the changes and hopefully I can contribute enough review comments. Thx > Improve response of scheduler/app activities for better understanding > - > > Key: YARN-9664 > URL: https://issues.apache.org/jira/browse/YARN-9664 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Tao Yang > Assignee: Tao Yang > Priority: Major > Attachments: YARN-9664.001.patch, YARN-9664.002.patch > > > Currently some diagnostics are not easy for common users to understand, and I found some places that still need improvement, such as missing partition information and a lack of necessary activities. This issue is to address these shortcomings.
[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917408#comment-16917408 ] Bilwa S T commented on YARN-9738: - Hi [~bibinchundatt], I have fixed the test cases by adding a null check in ClusterNodeTracker#exists and ClusterNodeTracker#getNode. > Remove lock on ClusterNodeTracker#getNodeReport as it blocks application > submission > --- > > Key: YARN-9738 > URL: https://issues.apache.org/jira/browse/YARN-9738 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Bilwa S T > Assignee: Bilwa S T > Priority: Major > Attachments: YARN-9738-001.patch, YARN-9738-002.patch, > YARN-9738-003.patch > > > *Env:* > Server OS: UBUNTU > No. of cluster nodes: 9120 NMs > Env mode: [Secure / Non secure] Secure > *Preconditions:* > ~9120 NMs were running > ~1250 applications were in running state > 35K applications were in pending state > *Test Steps:* > 1. Submit applications from 5 clients, each client with 2 threads, across 10 queues in total > 2. Once application submission increases (each distributed-shell application calls getClusterNodes), > *ClientRMService#getClusterNodes tries to get > ClusterNodeTracker#getNodeReport, where the nodes map is locked.* > {quote}
> "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x7f759f6d8858> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792)
> {quote} > *Instead we can make nodes a ConcurrentHashMap and remove the read lock*
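The proposed change above (replacing the read/write-locked map with a ConcurrentHashMap so report reads no longer block behind the fair lock) can be sketched as follows. This is an illustrative stand-in, not the actual ClusterNodeTracker code; note that, as the fix to the test cases suggests, callers such as exists() and getNode() must then tolerate null results:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Minimal stand-in for a tracked node's report.
class NodeReport {
    final String nodeId;
    final int numContainers;
    NodeReport(String nodeId, int numContainers) {
        this.nodeId = nodeId;
        this.numContainers = numContainers;
    }
}

public class LockFreeNodeTracker {
    // ConcurrentHashMap gives thread-safe point reads without a shared
    // read/write lock, so getNodeReport() cannot be blocked by writers.
    private final ConcurrentHashMap<String, NodeReport> nodes = new ConcurrentHashMap<>();

    public void addNode(NodeReport report) { nodes.put(report.nodeId, report); }
    public void removeNode(String nodeId) { nodes.remove(nodeId); }

    // May return null for an unknown node: callers need the null check
    // that the exists()/getNode() fix adds.
    public NodeReport getNodeReport(String nodeId) { return nodes.get(nodeId); }

    public boolean exists(String nodeId) { return nodes.get(nodeId) != null; }

    public List<NodeReport> getNodeReports() {
        // Weakly consistent snapshot; acceptable for a cluster-nodes listing.
        return new ArrayList<>(nodes.values());
    }
}
```

The trade-off is that iteration becomes weakly consistent rather than a locked point-in-time snapshot, which is generally fine for a node-report listing.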
[jira] [Commented] (YARN-9714) ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby
[ https://issues.apache.org/jira/browse/YARN-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917407#comment-16917407 ] Rohith Sharma K S commented on YARN-9714: - Thanks [~Tao Yang] for the patch! Instead of comparing, how about checking for resourceManager.getZKManager() == null? That basically syncs the zkManager initialization code with the code that closes it. That said, I don't see any issue with the current patch either.
[jira] [Commented] (YARN-5913) Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917402#comment-16917402 ] Yousef Abu-Salah commented on YARN-5913: Can I get a more descriptive rundown of the problem? > Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext > -- > > Key: YARN-5913 > URL: https://issues.apache.org/jira/browse/YARN-5913 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager > Reporter: Yufei Gu > Priority: Minor > Labels: newbie > > Usage of these two variables overlaps and causes confusion.
[jira] [Commented] (YARN-5913) Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917399#comment-16917399 ] Yufei Gu commented on YARN-5913: [~ykabusalah] Feel free to take any Jira without an assignee.
[jira] [Commented] (YARN-6425) Move out FS state dump code out of method update()
[ https://issues.apache.org/jira/browse/YARN-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917398#comment-16917398 ] Yufei Gu commented on YARN-6425: [~ykabusalah] Feel free to do that. > Move out FS state dump code out of method update() > -- > > Key: YARN-6425 > URL: https://issues.apache.org/jira/browse/YARN-6425 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler > Affects Versions: 2.9.0, 3.0.0-alpha2 > Reporter: Yufei Gu > Priority: Major > Labels: newbie++ > > Better to move the FS state dump code out of update(): > {code} > if (LOG.isDebugEnabled()) { > if (--updatesToSkipForDebug < 0) { > updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY; > dumpSchedulerState(); > } > } > {code} > And after that, we should distinguish between update-call duration and update-thread duration, as before YARN-6112.
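The refactor suggested above (pulling the periodic debug dump out of update() into its own method) might look like this minimal, self-contained sketch; the class name, the extracted method name, and the frequency constant are illustrative, not the actual FairScheduler code:

```java
public class SchedulerUpdateLoop {
    // Dump roughly every UPDATE_DEBUG_FREQUENCY update calls while debug logging is on.
    static final int UPDATE_DEBUG_FREQUENCY = 5;

    private final boolean debugEnabled;
    private int updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
    int dumps = 0; // visible for testing

    public SchedulerUpdateLoop(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

    public void update() {
        // ... the real scheduling work would happen here ...
        dumpSchedulerStateIfNeeded(); // dump logic no longer inlined in update()
    }

    // Extracted method: update() stays focused on scheduling, and the dump
    // cadence can be timed and tested separately from the update duration.
    private void dumpSchedulerStateIfNeeded() {
        if (debugEnabled && --updatesToSkipForDebug < 0) {
            updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
            dumpSchedulerState();
        }
    }

    private void dumpSchedulerState() { dumps++; }
}
```

Separating the dump into its own method is also what makes it possible to measure update-call duration distinct from the dump overhead, as the comment asks.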
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917374#comment-16917374 ] Hadoop QA commented on YARN-9562: -

-1 overall

| Vote | Subsystem | Runtime | Comment |
| 0 | reexec | 1m 37s | Docker mode activated. |

Prechecks:
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 9 new or modified test files. |

trunk Compile Tests:
| 0 | mvndep | 1m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 22s | trunk passed |
| +1 | compile | 11m 33s | trunk passed |
| +1 | checkstyle | 1m 42s | trunk passed |
| +1 | mvnsite | 1m 51s | trunk passed |
| +1 | shadedclient | 17m 54s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 4s | trunk passed |
| +1 | javadoc | 1m 25s | trunk passed |

Patch Compile Tests:
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 23s | the patch passed |
| +1 | compile | 10m 45s | the patch passed |
| +1 | javac | 10m 45s | the patch passed |
| -0 | checkstyle | 1m 36s | hadoop-yarn-project/hadoop-yarn: The patch generated 37 new + 689 unchanged - 2 fixed = 726 total (was 691) |
| +1 | mvnsite | 1m 44s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 19s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 33s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | javadoc | 1m 12s | the patch passed |

Other Tests:
| -1 | unit | 0m 50s | hadoop-yarn-api in the patch failed. |
| -1 | unit | 21m 44s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 122m 35s | |

| Reason | Tests |
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| | Nullcheck of NodeManager.context at line 535 of value previously dereferenced in org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop() At NodeManager.java:535 of value previously dereferenced in org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop() At NodeManager.java:[line 532] |
| | Unused field:NodeManager.java |
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
| | hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |

|| Subsystem || Report/Not
[jira] [Commented] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider
[ https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917237#comment-16917237 ] Hudson commented on YARN-9755: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17191 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17191/]) YARN-9755. Fixed RM failing to start when (eyang: rev 717c853873dd3b9112f5c15059a24655b8654607) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/FileSystemBasedConfigurationProvider.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java > RM fails to start with FileSystemBasedConfigurationProvider > --- > > Key: YARN-9755 > URL: https://issues.apache.org/jira/browse/YARN-9755 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9755-001.patch, YARN-9755-002.patch, > YARN-9755-003.patch, YARN-9755-004.patch > > > RM fails to start with below exception when > FileSystemBasedConfigurationProvider is used. 
> *Exception:* > {code} > 2019-08-16 12:05:33,802 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: > java.io.IOException: Filesystem closed > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567) > Caused by: java.io.IOException: java.io.IOException: Filesystem closed > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 14 more > Caused by: java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598) > at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701) > at > org.apache.hadoop.yarn.FileSystemBase
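A likely mechanism behind this kind of "Filesystem closed" failure (hedged: the stack trace alone does not prove this is the exact cause here) is that FileSystem.get() returns a JVM-wide cached instance, so one component calling close() invalidates the handle every other component still holds. The toy cache below reproduces that hazard without any Hadoop dependency; all names are illustrative stand-ins:

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of FileSystem.get()'s JVM-wide cache: same key -> same instance.
class ToyFileSystem {
    private static final Map<String, ToyFileSystem> CACHE = new ConcurrentHashMap<>();
    private volatile boolean open = true;

    static ToyFileSystem get(String uri) {
        // Mirrors Hadoop's behavior of handing out a shared cached instance.
        return CACHE.computeIfAbsent(uri, u -> new ToyFileSystem());
    }

    void close() { open = false; }

    boolean exists(String path) throws IOException {
        // The error string matches the one in the RM log above.
        if (!open) throw new IOException("Filesystem closed");
        return true;
    }
}
```

If component A closes the shared instance during a reinitialize, component B's later exists() call throws, which is why fixes in this area typically either avoid closing the shared instance or use a private, uncached instance.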
[jira] [Commented] (YARN-9438) launchTime not written to state store for running applications
[ https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917242#comment-16917242 ] Hudson commented on YARN-9438: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17191 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17191/]) YARN-9438. launchTime not written to state store for running (jhung: rev 8ef46595da6aefe4458aa7181670c3d9b13e7ec6) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java > launchTime not written to state store for running applications > -- > > Key: YARN-9438 > URL: https://issues.apache.org/jira/browse/YARN-9438 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.10.0 >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Labels: release-blocker > Fix For: 2.10.0, 3.3.0, 3.2.1, 3.1.3 > > Attachments: YARN-9438-branch-2.001.patch, > YARN-9438-branch-2.002.patch, YARN-9438.001.patch, YARN-9438.002.patch, > YARN-9438.003.patch, YARN-9438.004.patch > > > launchTime is only saved to state store after application finishes, so if > restart happens, any running applications will have 
launchTime set as -1 > (since this is the default timestamp of the recovery event).
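The fix direction implied by the summary (persist launchTime to the state store when the application launches, not only at finish, so recovery does not fall back to the -1 default) can be sketched with this illustrative, non-Hadoop model; the class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Toy state store: appId -> persisted launch time.
class ToyStateStore {
    private final Map<String, Long> launchTimes = new HashMap<>();

    void updateLaunchTime(String appId, long t) { launchTimes.put(appId, t); }

    // Recovery default mirrors the -1 described in the issue: any app whose
    // launch time was never persisted comes back with the sentinel value.
    long recoverLaunchTime(String appId) { return launchTimes.getOrDefault(appId, -1L); }
}

class ToyApp {
    private final String appId;
    private final ToyStateStore store;

    ToyApp(String appId, ToyStateStore store) {
        this.appId = appId;
        this.store = store;
    }

    // Write launchTime as soon as the app launches, so a still-running app
    // survives an RM restart with its real launch time instead of -1.
    void onLaunched(long now) { store.updateLaunchTime(appId, now); }
}
```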
[jira] [Commented] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917227#comment-16917227 ] Jonathan Hung commented on YARN-9770: - Thanks [~haibochen]. Makes sense. Attached 003 addressing these comments. > Create a queue ordering policy which picks child queues with equal probability > -- > > Key: YARN-9770 > URL: https://issues.apache.org/jira/browse/YARN-9770 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Jonathan Hung > Assignee: Jonathan Hung > Priority: Major > Labels: release-blocker > Attachments: YARN-9770.001.patch, YARN-9770.002.patch, > YARN-9770.003.patch > > > Ran some simulations with the default queue_utilization_ordering_policy: > An underutilized queue that receives an application with many (thousands of) resource requests will hog scheduler allocations for a long time (on the order of a minute). In the meantime, apps are being submitted to all other queues, which increases activeUsers in those queues, which in turn drops the user limit in those queues to small values if minimum-user-limit-percent is configured to small values (e.g. 10%). > To avoid this, we assign to queues with equal probability, so that no queue goes without allocations for a long time.
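An equal-probability ordering policy like the one described can be sketched as a fresh shuffle of the child queues on each allocation round. This is an illustrative stand-in for the proposed policy, not the actual patch; queue names are represented as plain strings:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

public class EqualProbabilityOrderingPolicy {
    private final Random random;

    public EqualProbabilityOrderingPolicy(Random random) { this.random = random; }

    // Each assignment round visits the child queues in a fresh uniform-random
    // order, so no single busy queue can monopolize consecutive rounds the way
    // the utilization-ordered policy lets an underutilized queue do.
    public Iterator<String> getAssignmentIterator(List<String> childQueues) {
        List<String> shuffled = new ArrayList<>(childQueues);
        Collections.shuffle(shuffled, random);
        return shuffled.iterator();
    }
}
```

Shuffling per round keeps each queue's expected share of scheduling attempts equal regardless of how many outstanding requests any one application has.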
[jira] [Updated] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-9770: Attachment: YARN-9770.003.patch
[jira] [Commented] (YARN-9754) Add support for arbitrary DAG AM Simulator.
[ https://issues.apache.org/jira/browse/YARN-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917226#comment-16917226 ] Hadoop QA commented on YARN-9754: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 3s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 12s{color} | {color:orange} hadoop-tools/hadoop-sls: The patch generated 2 new + 12 unchanged - 0 fixed = 14 total (was 12) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 54s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 42s{color} | {color:green} hadoop-sls in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9754 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978695/YARN-9754.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 543a1d9eb17e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 66cfa48 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24646/artifact/out/diff-checkstyle-hadoop-tools_hadoop-sls.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24646/testReport/ | | Max. process+thread count | 429 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-sls U: hadoop-tools/hadoop-sls | | Console output |
[jira] [Updated] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-9770: Attachment: (was: YARN-9770.003.patch) > Create a queue ordering policy which picks child queues with equal probability > -- > > Key: YARN-9770 > URL: https://issues.apache.org/jira/browse/YARN-9770 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Labels: release-blocker > Attachments: YARN-9770.001.patch, YARN-9770.002.patch > > > Ran some simulations with the default queue_utilization_ordering_policy: > An underutilized queue which receives an application with many (thousands) > resource requests will hog scheduler allocations for a long time (on the > order of a minute). In the meantime apps are getting submitted to all other > queues, which increases activeUsers in these queues, which drops user limit > in these queues to small values if minimum-user-limit-percent is configured > to small values (e.g. 10%). > To avoid this issue, we assign to queues with equal probability, to avoid > scenarios where queues don't get allocations for a long time. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-9770: Attachment: YARN-9770.003.patch > Create a queue ordering policy which picks child queues with equal probability > -- > > Key: YARN-9770 > URL: https://issues.apache.org/jira/browse/YARN-9770 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Labels: release-blocker > Attachments: YARN-9770.001.patch, YARN-9770.002.patch > > > Ran some simulations with the default queue_utilization_ordering_policy: > An underutilized queue which receives an application with many (thousands) > resource requests will hog scheduler allocations for a long time (on the > order of a minute). In the meantime apps are getting submitted to all other > queues, which increases activeUsers in these queues, which drops user limit > in these queues to small values if minimum-user-limit-percent is configured > to small values (e.g. 10%). > To avoid this issue, we assign to queues with equal probability, to avoid > scenarios where queues don't get allocations for a long time. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917148#comment-16917148 ] Hadoop QA commented on YARN-9738: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 83m 56s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}132m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9738 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978682/YARN-9738-003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a4dc6e8d4d94 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 66cfa48 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24645/testReport/ | | Max. process+thread count | 802 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24645/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Remove lock on ClusterNodeTracker#getN
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917140#comment-16917140 ] Eric Badger commented on YARN-9561: --- Hey [~eyang], in the meantime I got a chance to go back through your patch comments from a little while ago. Let me know if you have any questions. bq. In setup_container_paths, it would be good to use fprintf instead of fputs and include the actual path location. This helps system admins debug the error more precisely. I don't think this should be added in the code that was created by this patch. The functions that are called from within {{setup_container_paths()}} are the ones that should give more fine-grained error logging, because at the {{setup_container_paths()}} level I have no idea which directory failed to be created; I would only be able to give a large list. So I think this is an improvement for the underlying functions, but it isn't relevant to this specific patch. bq. Can make_string be used instead of strbuf_append_fmt for readability reasons and to reduce the need for string format functions? The 16k size seems like a limit that is easy to reach. make_string may use more memory during string construction, but maybe it is safer? The 16k size is just the initial size to create {{upperdir=%s,workdir=%s,lowerdir=}}. Upperdir and workdir can only be single directories, so the only way the 16k limit would be exceeded is if these directory paths were absolutely gigantic. Inside of {{strbuf_append_fmt()}} we take into account how much memory needs to be allocated and reallocate that amount. So unless the system itself is out of memory, we will be able to allocate the full buffer regardless of the number of layers. This would be much harder to do if we used {{make_string()}}, because we would have to concatenate the strings together or we would end up doing something like what {{strbuf_append_fmt()}} does anyway. bq. 
This part of code doesn't seem to have any effect: {noformat} de = is_docker_support_enabled() ? enabled : disabled; fprintf(stream, "%11s launch docker container: %2d appid containerid workdir " "container-script tokens pidfile nm-local-dirs nm-log-dirs " "docker-command-file resources ", de, LAUNCH_DOCKER_CONTAINER); {noformat} It has no effect on the patch related to the OCI code, yes. We fixed up the display code to be a little bit cleaner and so we modified this to be in line with the cleaner style. > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch, > YARN-9561.003.patch, YARN-9561.004.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
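The grow-on-demand buffer argument above is easy to demonstrate outside of C. The sketch below is illustrative Java, not the container-executor code: it builds the {{upperdir=...,workdir=...,lowerdir=...}} overlay mount option quoted above with a buffer that reallocates as layers are appended, so the layer count never runs into a fixed cap. The class and method names here are hypothetical.

```java
import java.util.List;

public class OverlayMountOption {
    // Builds "upperdir=U,workdir=W,lowerdir=l1:l2:...".
    // StringBuilder reallocates its backing array on demand, mirroring
    // strbuf_append_fmt's grow-on-demand behavior rather than a fixed
    // 16k buffer; only the initial capacity is preallocated.
    static String build(String upper, String work, List<String> lowers) {
        StringBuilder sb = new StringBuilder(16 * 1024);
        sb.append("upperdir=").append(upper)
          .append(",workdir=").append(work)
          .append(",lowerdir=")
          .append(String.join(":", lowers));
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints upperdir=/u,workdir=/w,lowerdir=/a:/b
        System.out.println(build("/u", "/w", List.of("/a", "/b")));
    }
}
```

However many lower layers are passed in, the append calls succeed as long as memory is available, which is the same property Eric describes for {{strbuf_append_fmt()}}.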
[jira] [Commented] (YARN-9781) SchedConfCli to get current stored scheduler configuration
[ https://issues.apache.org/jira/browse/YARN-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917134#comment-16917134 ] Hadoop QA commented on YARN-9781: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 52s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 21 unchanged - 3 fixed = 21 total (was 24) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 26s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 39s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 83m 50s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 50s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}194m 48s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9781 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978658/YARN-9781-003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 0548de174e9e 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x8
[jira] [Commented] (YARN-9785) Application gets activated even when AM memory has reached
[ https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917093#comment-16917093 ] Hadoop QA commented on YARN-9785: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.0 Server=19.03.0 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9785 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978680/YARN-9785-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5961f47f6030 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 66cfa48 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24644/testReport/ | | Max. process+thread count | 308 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24644/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Application gets activated even when AM memory has reached > -- > >
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917076#comment-16917076 ] Eric Badger commented on YARN-9561: --- bq. Do you mean the C side? The Java side does not have privileges to run modprobe or lsmod due to lack of root privileges. I don't believe we need root privileges to run lsmod. It simply parses /proc/modules. On RHEL 7 this file is world-readable. So I think an {{lsmod | grep overlay}} would be sufficient. bq. It took me several days to restore my cluster to a working state with the overlay kernel module installed. In the latest patch 004, the mapreduce pi job fails when trying to run mapreduce pi: If you're failing in Java, then that means that the overlay mounts all worked and that runC has been correctly invoked. That's fantastic news! We're very close. bq. Do we need implicit mounting of Hadoop binaries to enable existing workloads to run with runc? If not, what steps can be used to run an example app? I don't have any Hadoop jars in the image or bind-mounted into the image. Instead, I'm running using a Hadoop tarball in HDFS. 
{noformat:title=mapred-site.xml} mapreduce.application.framework.path ${fs.defaultFS}/user/ebadger/hadoop-3.3.0-SNAPSHOT.tar.gz#hadoop-mapreduce mapreduce.application.classpath ./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/hdfs/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/hdfs/lib/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/lib/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/mapreduce/*,./hadoop-mapreduce/hadoop-3.3.0-SNAPSHOT/share/hadoop/mapreduce/lib/* {noformat} If you would like to bind-mount the hadoop jars instead, you can add them to the default mount list {{yarn.nodemanager.runtime.linux.docker.default-rw-mounts}} or {{yarn.nodemanager.runtime.linux.docker.default-ro-mounts}} (don't think you should need them to be writable). You can choose where in the image that you'd like them to be mounted and then set your classpath up to reflect where the jars are located. {noformat:title=Default Mount List Example} yarn.nodemanager.runtime.linux.docker.default-rw-mounts /var/run/nscd:/var/run/nscd {noformat} > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch, > YARN-9561.003.patch, YARN-9561.004.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
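For reference, the lsmod-without-root point above boils down to parsing /proc/modules, where the first whitespace-separated field on each line is the module name. A minimal Java sketch of that check follows; the class name and sample contents are made up for illustration, and this is not YARN code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class KernelModuleCheck {
    // /proc/modules lists one loaded module per line; the module name is
    // the first field. This is the same data lsmod formats for display.
    static boolean isLoaded(String procModulesContents, String module) {
        for (String line : procModulesContents.split("\n")) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length > 0 && fields[0].equals(module)) {
                return true;
            }
        }
        return false;
    }

    // Reads the live file; world-readable on typical Linux systems,
    // so no root privileges are required.
    static boolean isLoaded(String module) throws IOException {
        return isLoaded(Files.readString(Path.of("/proc/modules")), module);
    }

    public static void main(String[] args) {
        String sample = "overlay 86016 2 - Live 0x0000000000000000\n"
                      + "xfs 1234567 1 - Live 0x0000000000000000\n";
        System.out.println(isLoaded(sample, "overlay")); // prints true
    }
}
```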
[jira] [Updated] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
[ https://issues.apache.org/jira/browse/YARN-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9789: Attachment: YARN-9789-001.patch > Disable Option for Write Ahead Logs of LogMutation > -- > > Key: YARN-9789 > URL: https://issues.apache.org/jira/browse/YARN-9789 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9789-001.patch > > > When yarn.scheduler.configuration.store.max-logs is set to zero, the > YARNConfigurationStore (ZK, LevelDB) reads the write ahead logs from the > backend which is not needed. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917058#comment-16917058 ] Haibo Chen commented on YARN-9770: -- Thanks [~jhung] for the patch. The patch looks good to me overall. I have two minor comments. 1) Can we rename FairQueueOrderingPolicy to RandomQueueOrderingPolicy to reduce cognitive load, since the notion of fairness already has a different meaning in FairScheduler? 2) In the constructor of RandomIterator, given that we assume the swap operation is efficient and we are only passing in an ArrayList, how about we restrict the parameter type to ArrayList? The checkstyle issue can also be addressed. > Create a queue ordering policy which picks child queues with equal probability > -- > > Key: YARN-9770 > URL: https://issues.apache.org/jira/browse/YARN-9770 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Labels: release-blocker > Attachments: YARN-9770.001.patch, YARN-9770.002.patch > > > Ran some simulations with the default queue_utilization_ordering_policy: > An underutilized queue which receives an application with many (thousands) > resource requests will hog scheduler allocations for a long time (on the > order of a minute). In the meantime apps are getting submitted to all other > queues, which increases activeUsers in these queues, which drops user limit > in these queues to small values if minimum-user-limit-percent is configured > to small values (e.g. 10%). > To avoid this issue, we assign to queues with equal probability, to avoid > scenarios where queues don't get allocations for a long time. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
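For readers following the RandomIterator discussion: the efficient-swap idea amounts to a lazy Fisher-Yates shuffle, where each next() swaps a randomly chosen unvisited element to the end of the unvisited region — O(1) per step only on a random-access list such as ArrayList, which is why restricting the parameter type makes sense. The sketch below is hypothetical, not the patch's actual class, and it permutes the backing list in place:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

// Visits the elements of the given ArrayList exactly once, in uniformly
// random order, using in-place swaps (lazy Fisher-Yates shuffle).
public class RandomIterator<T> implements Iterator<T> {
    private final ArrayList<T> items;
    private final Random rand;
    private int remaining;   // items[0..remaining) are not yet visited

    public RandomIterator(ArrayList<T> items, Random rand) {
        this.items = items;
        this.rand = rand;
        this.remaining = items.size();
    }

    @Override
    public boolean hasNext() {
        return remaining > 0;
    }

    @Override
    public T next() {
        if (remaining == 0) {
            throw new NoSuchElementException();
        }
        // Pick a random unvisited element and swap it into the last
        // unvisited slot; both get/set are O(1) on an ArrayList.
        int i = rand.nextInt(remaining);
        T picked = items.get(i);
        items.set(i, items.get(remaining - 1));
        items.set(remaining - 1, picked);
        remaining--;
        return picked;
    }
}
```

Because every unvisited element is equally likely at every step, each queue (element) is chosen with equal probability, which is the property the Jira description asks for.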
[jira] [Commented] (YARN-9754) Add support for arbitrary DAG AM Simulator.
[ https://issues.apache.org/jira/browse/YARN-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917057#comment-16917057 ] Íñigo Goiri commented on YARN-9754: --- Thanks [~abmodi], the test with the times looks good. Minor comments: * Make assertEquals a static import. * Add some high-level comment to testGetToBeScheduledContainers explaining you are asking for a container of X, Y and Z delay, etc. * Is there something else we can assert in TestSLSDagAMSimulator that relates to sls_dag.json? > Add support for arbitrary DAG AM Simulator. > --- > > Key: YARN-9754 > URL: https://issues.apache.org/jira/browse/YARN-9754 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9754.001.patch, YARN-9754.002.patch, > YARN-9754.003.patch, YARN-9754.004.patch > > > Currently, all map containers are requested as soon as the Application Master > comes up, and then all reducer containers are requested. This doesn't give the > flexibility to simulate the behavior of a DAG where varying numbers of containers > would be requested at different times.
[jira] [Commented] (YARN-9438) launchTime not written to state store for running applications
[ https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917051#comment-16917051 ] Haibo Chen commented on YARN-9438: -- +1 on the latest 004 patch > launchTime not written to state store for running applications > -- > > Key: YARN-9438 > URL: https://issues.apache.org/jira/browse/YARN-9438 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.10.0 >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Labels: release-blocker > Attachments: YARN-9438-branch-2.001.patch, > YARN-9438-branch-2.002.patch, YARN-9438.001.patch, YARN-9438.002.patch, > YARN-9438.003.patch, YARN-9438.004.patch > > > launchTime is only saved to state store after application finishes, so if > restart happens, any running applications will have launchTime set as -1 > (since this is the default timestamp of the recovery event).
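The description above boils down to a persistence-timing bug: launchTime is only written at application finish, so recovering a still-running app falls back to the default -1. A toy model of the buggy versus fixed paths is sketched below; all names are invented for illustration and do not reflect RMStateStore's real API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of YARN-9438: if launchTime is only persisted when the app
// finishes, a restart while the app is running recovers the default -1.
class LaunchTimeStoreDemo {
  private final Map<String, Long> store = new HashMap<>();

  // Recovery path: whatever was persisted, or the default -1 if nothing was.
  long recover(String appId) {
    return store.getOrDefault(appId, -1L);
  }

  // Fixed path: persist launchTime as soon as the attempt launches, so a
  // restart mid-run recovers the real value instead of the default.
  void onLaunch(String appId, long launchTime) {
    store.put(appId, launchTime);
  }
}
```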
[jira] [Reopened] (YARN-7585) NodeManager should go unhealthy when state store throws DBException
[ https://issues.apache.org/jira/browse/YARN-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung reopened YARN-7585: - Reopening this issue to track the backport to branch-2. Waiting for YARN-8200 merge to backport this (since the fix for this JIRA touches some of that code). > NodeManager should go unhealthy when state store throws DBException > > > Key: YARN-7585 > URL: https://issues.apache.org/jira/browse/YARN-7585 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: release-blocker > Fix For: 3.1.0 > > Attachments: YARN-7585.001.patch, YARN-7585.002.patch, > YARN-7585.003.patch > > > If work preserving recover is enabled the NM will not start up if the state > store does not initialise. However if the state store becomes unavailable > after that for any reason the NM will not go unhealthy. > Since the state store is not available new containers can not be started any > more and the NM should become unhealthy: > {code} > AMLauncher: Error launching appattempt_1508806289867_268617_01. Got > exception: org.apache.hadoop.yarn.exceptions.YarnException: > java.io.IOException: org.iq80.leveldb.DBException: IO error: > /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: > Read-only file system > at o.a.h.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) > at > o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:721) > ... 
> Caused by: java.io.IOException: org.iq80.leveldb.DBException: IO error: > /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: > Read-only file system > at > o.a.h.y.s.n.r.NMLeveldbStateStoreService.storeApplication(NMLeveldbStateStoreService.java:374) > at > o.a.h.y.s.n.cm.ContainerManagerImpl.startContainerInternal(ContainerManagerImpl.java:848) > at > o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:712) > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
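The fix direction described above, flipping the NM to unhealthy once the state store starts throwing, can be sketched generically. All class and method names here are invented stand-ins, not the actual YARN-7585 patch, which works against the leveldb-backed NMStateStoreService.

```java
import java.io.IOException;

// Illustrative sketch only: wrap state-store writes and mark the node
// unhealthy on failure, so the RM stops scheduling containers onto it.
class HealthTrackingStore {
  interface StateStore {
    void storeApplication(String appId) throws IOException;
  }

  private final StateStore delegate;
  private volatile boolean healthy = true;

  HealthTrackingStore(StateStore delegate) {
    this.delegate = delegate;
  }

  boolean isHealthy() {
    return healthy;
  }

  void storeApplication(String appId) throws IOException {
    try {
      delegate.storeApplication(appId);
    } catch (IOException e) {
      // A failed write (e.g. leveldb on a read-only filesystem) means new
      // containers cannot be recovered, so report the node unhealthy
      // instead of silently accepting work that would be lost on restart.
      healthy = false;
      throw e;
    }
  }
}
```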
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917003#comment-16917003 ] Eric Yang commented on YARN-9561: - [~ebadger] {quote}For your 3rd point, I think it would be better to do this check in Java. That way we can catch the failure earlier since all containers trying to run runC will fail if overlay is not installed. For the check, I was thinking of doing an lsmod on "overlay"{quote} Do you mean C side? Java side does not have privileges to run modprobe or lsmod due to lack of root privileges. It took me several days to restore my cluster to a working state with overlay kernel module installed. In the latest patch 004, mapreduce pi job fails when trying to run mapreduce pi: {code} vars="YARN_CONTAINER_RUNTIME_TYPE=runc,YARN_CONTAINER_RUNTIME_RUNC_IMAGE=local/java-centos:latest" ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0-SNAPSHOT.jar pi -Dmapreduce.map.env=$vars -Dmapreduce.reduce.env=$vars 10 100 {code} Observed log output: {code} 2019-08-27 11:46:08,377 INFO mapreduce.Job: Task Id : attempt_1566930487263_0002_m_02_2, Status : FAILED [2019-08-27 11:46:07.131]Exception from container-launch. Container id: container_1566930487263_0002_01_30 Exit code: 1 Exception message: Launch container failed [2019-08-27 11:46:07.133]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : Error: Could not find or load main class org.apache.hadoop.mapred.YarnChild [2019-08-27 11:46:07.134]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : Error: Could not find or load main class org.apache.hadoop.mapred.YarnChild {code} My base container image only contains centos:latest, and java-1.8.0-openjdk rpm installed. It does not have Hadoop binaries in the container. 
Do we need implicit mounting of Hadoop binaries to enable existing workload to run with runc? If not, what step can be used to run an example app? > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch, > YARN-9561.003.patch, YARN-9561.004.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916977#comment-16916977 ] Eric Badger commented on YARN-9562: --- Patch 008 adds a bunch of checkstyle and findbugs cleanup > Add Java changes for the new RuncContainerRuntime > - > > Key: YARN-9562 > URL: https://issues.apache.org/jira/browse/YARN-9562 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9562.001.patch, YARN-9562.002.patch, > YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, > YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch > > > This JIRA will be used to add the Java changes for the new > RuncContainerRuntime. This will work off of YARN-9560 to use much of the > existing DockerLinuxContainerRuntime code once it is moved up into an > abstract class that can be extended.
[jira] [Updated] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9562: -- Attachment: YARN-9562.008.patch > Add Java changes for the new RuncContainerRuntime > - > > Key: YARN-9562 > URL: https://issues.apache.org/jira/browse/YARN-9562 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9562.001.patch, YARN-9562.002.patch, > YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, > YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch > > > This JIRA will be used to add the Java changes for the new > RuncContainerRuntime. This will work off of YARN-9560 to use much of the > existing DockerLinuxContainerRuntime code once it is moved up into an > abstract class that can be extended.
[jira] [Commented] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider
[ https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916924#comment-16916924 ] Prabhu Joseph commented on YARN-9755: - Thanks [~eyang]. > RM fails to start with FileSystemBasedConfigurationProvider > --- > > Key: YARN-9755 > URL: https://issues.apache.org/jira/browse/YARN-9755 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9755-001.patch, YARN-9755-002.patch, > YARN-9755-003.patch, YARN-9755-004.patch > > > RM fails to start with below exception when > FileSystemBasedConfigurationProvider is used. > *Exception:* > {code} > 2019-08-16 12:05:33,802 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: > java.io.IOException: Filesystem closed > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567) > Caused by: java.io.IOException: java.io.IOException: Filesystem closed > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 
14 more > Caused by: java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598) > at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701) > at > org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56) > {code} > FileSystemBasedConfigurationProvider uses the cached FileSystem causing the > issue. > *Configs:* > {code} > yarn.resourcemanager.configuration.provider-classorg.apache.hadoop.yarn.FileSystemBasedConfigurationProvider > yarn.resourcemanager.configuration.file-system-based-store/yarn/conf > [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf > -rw-r--r-- 3 yarn supergroup 4138 2019-08-16 13:09 >
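The root cause noted above is that FileSystemBasedConfigurationProvider relies on the cached FileSystem instance, which another component had already closed. The following self-contained model shows why a shared cache behaves this way; the Fs class is an invented stand-in for the HDFS client, and the usual remedy in Hadoop is to obtain a non-cached instance (e.g. via FileSystem.newInstance) rather than the shared one from FileSystem.get.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Minimal model of the shared-cache behaviour behind YARN-9755.
// "Fs" is an invented stand-in for the HDFS client, not a real Hadoop class.
class FsCacheDemo {
  static class Fs {
    private boolean open = true;
    void close() { open = false; }
    boolean exists(String path) throws IOException {
      if (!open) throw new IOException("Filesystem closed");
      return true;
    }
  }

  private static final Map<String, Fs> CACHE = new HashMap<>();

  // Like FileSystem.get(conf): every caller with the same key shares one
  // instance, so one caller's close() poisons all the others.
  static synchronized Fs getCached(String key) {
    return CACHE.computeIfAbsent(key, k -> new Fs());
  }

  // Like FileSystem.newInstance(conf): a private instance per caller,
  // immune to another component closing its own handle.
  static Fs newInstance(String key) {
    return new Fs();
  }
}
```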
[jira] [Updated] (YARN-9773) Add QueueMetrics for Custom Resources/Resource vectors
[ https://issues.apache.org/jira/browse/YARN-9773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-9773: - Description: Although the custom resource metrics are calculated and saved as a QueueMetricsForCustomResources object within the QueueMetrics class, the JMX and Simon QueueMetrics do not report that information for custom resources. > Add QueueMetrics for Custom Resources/Resource vectors > -- > > Key: YARN-9773 > URL: https://issues.apache.org/jira/browse/YARN-9773 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > > Although the custom resource metrics are calculated and saved as a > QueueMetricsForCustomResources object within the QueueMetrics class, the JMX > and Simon QueueMetrics do not report that information for custom resources.
[jira] [Updated] (YARN-9773) Add QueueMetrics for Custom Resources/Resource vectors
[ https://issues.apache.org/jira/browse/YARN-9773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-9773: - Summary: Add QueueMetrics for Custom Resources/Resource vectors (was: PartitionQueueMetrics for Custom Resources/Resource vectors) > Add QueueMetrics for Custom Resources/Resource vectors > -- > > Key: YARN-9773 > URL: https://issues.apache.org/jira/browse/YARN-9773 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major >
[jira] [Commented] (YARN-6425) Move out FS state dump code out of method update()
[ https://issues.apache.org/jira/browse/YARN-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916842#comment-16916842 ] Yousef Abu-Salah commented on YARN-6425: Can I take this issue? In addition, could I get a more descriptive definition of the problem and what needs to be accomplished? > Move out FS state dump code out of method update() > -- > > Key: YARN-6425 > URL: https://issues.apache.org/jira/browse/YARN-6425 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Priority: Major > Labels: newbie++ > > Better to move the FS state dump code out of update() > {code} > if (LOG.isDebugEnabled()) { > if (--updatesToSkipForDebug < 0) { > updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY; > dumpSchedulerState(); > } > } > {code} > And after that, we should distinguish between update-call duration and update-thread > duration, like before YARN-6112.
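A sketch of the refactor the JIRA asks for, pulling the skip-counter and dump logic quoted above out of update() into its own method. UPDATE_DEBUG_FREQUENCY, dumpSchedulerState() and the debug flag are stand-ins for the FairScheduler internals, not the real fields.

```java
// Illustrative refactor of the snippet quoted in YARN-6425: update() only
// delegates, and the dump bookkeeping is self-contained in one method.
class DebugDumpSampler {
  private static final int UPDATE_DEBUG_FREQUENCY = 5;
  private int updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
  private int dumps = 0;
  private final boolean debugEnabled = true; // stands in for LOG.isDebugEnabled()

  // update() keeps only scheduling work; the dump logic moved out.
  void update() {
    maybeDumpSchedulerState();
  }

  // Same counter pattern as the quoted code: skip UPDATE_DEBUG_FREQUENCY
  // calls, dump on the next one, then reset, i.e. one dump per
  // (UPDATE_DEBUG_FREQUENCY + 1) calls while debug logging is on.
  private void maybeDumpSchedulerState() {
    if (debugEnabled && --updatesToSkipForDebug < 0) {
      updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
      dumpSchedulerState();
    }
  }

  private void dumpSchedulerState() { dumps++; }

  int getDumps() { return dumps; }
}
```

With the dump logic isolated, it also becomes possible to time the update call itself separately from the whole update-thread iteration, which is the second half of the request.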
[jira] [Comment Edited] (YARN-5913) Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916838#comment-16916838 ] Yousef Abu-Salah edited comment on YARN-5913 at 8/27/19 4:21 PM: - Can I take this issue? In addition, could I get a more descriptive definition of the problem and what needs to be accomplished? was (Author: ykabusalah): Can I take this issue? > Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext > -- > > Key: YARN-5913 > URL: https://issues.apache.org/jira/browse/YARN-5913 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Yufei Gu >Priority: Minor > Labels: newbie > > Usage of these two variables overlaps and causes confusion.
[jira] [Commented] (YARN-5913) Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916838#comment-16916838 ] Yousef Abu-Salah commented on YARN-5913: Can I take this issue? > Consolidate "resource" and "amResourceRequest" in ApplicationSubmissionContext > -- > > Key: YARN-5913 > URL: https://issues.apache.org/jira/browse/YARN-5913 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Yufei Gu >Priority: Minor > Labels: newbie > > Usage of these two variables overlaps and causes confusion.
[jira] [Commented] (YARN-9754) Add support for arbitrary DAG AM Simulator.
[ https://issues.apache.org/jira/browse/YARN-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916721#comment-16916721 ] Abhishek Modi commented on YARN-9754: - Thanks [~elgoiri] for the review. I have addressed all the review comments in the latest patch. Since changing the AMSimulator will also impact MRAMSimulator and StreamAMSimulator, I will create a separate Jira for that. > Add support for arbitrary DAG AM Simulator. > --- > > Key: YARN-9754 > URL: https://issues.apache.org/jira/browse/YARN-9754 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9754.001.patch, YARN-9754.002.patch, > YARN-9754.003.patch, YARN-9754.004.patch > > > Currently, all map containers are requested as soon as the Application Master > comes up, and then all reducer containers are requested. This doesn't give the > flexibility to simulate the behavior of a DAG where varying numbers of containers > would be requested at different times.
[jira] [Updated] (YARN-9754) Add support for arbitrary DAG AM Simulator.
[ https://issues.apache.org/jira/browse/YARN-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9754: Attachment: YARN-9754.004.patch > Add support for arbitrary DAG AM Simulator. > --- > > Key: YARN-9754 > URL: https://issues.apache.org/jira/browse/YARN-9754 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9754.001.patch, YARN-9754.002.patch, > YARN-9754.003.patch, YARN-9754.004.patch > > > Currently, all map containers are requested as soon as the Application Master > comes up, and then all reducer containers are requested. This doesn't give the > flexibility to simulate the behavior of a DAG where varying numbers of containers > would be requested at different times.
[jira] [Commented] (YARN-9509) Capped cpu usage with cgroup strict-resource-usage based on a mulitplier
[ https://issues.apache.org/jira/browse/YARN-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916669#comment-16916669 ] Hadoop QA commented on YARN-9509: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 32s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 219 unchanged - 0 fixed = 224 total (was 219) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 52s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 42s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}122m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-
[jira] [Commented] (YARN-9009) Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs
[ https://issues.apache.org/jira/browse/YARN-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916657#comment-16916657 ] Hadoop QA commented on YARN-9009: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s{color} | {color:red} https://github.com/apache/hadoop/pull/438 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | GITHUB PR | https://github.com/apache/hadoop/pull/438 | | JIRA Issue | YARN-9009 | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-438/8/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. > Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs > --- > > Key: YARN-9009 > URL: https://issues.apache.org/jira/browse/YARN-9009 > Project: Hadoop YARN > Issue Type: Bug > Environment: Ubuntu 18.04 > java version "1.8.0_181" > Java(TM) SE Runtime Environment (build 1.8.0_181-b13) > Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode) > > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-17T13:33:14-05:00) >Reporter: OrDTesters >Assignee: OrDTesters >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: YARN-9009-trunk-001.patch > > > In TestEntityGroupFSTimelineStore, testCleanLogs fails when run after > testMoveToDone. > testCleanLogs fails because testMoveToDone moves a file into the same > directory that testCleanLogs cleans, causing testCleanLogs to clean 3 files, > instead of 2 as testCleanLogs expects. 
> To fix the failure of testCleanLogs, we can delete the file after the file is > moved by testMoveToDone. > Pull request link: [https://github.com/apache/hadoop/pull/438]
[jira] [Updated] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9738: Attachment: YARN-9738-003.patch > Remove lock on ClusterNodeTracker#getNodeReport as it blocks application > submission > --- > > Key: YARN-9738 > URL: https://issues.apache.org/jira/browse/YARN-9738 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9738-001.patch, YARN-9738-002.patch, > YARN-9738-003.patch > > > *Env :* > Server OS :- UBUNTU > No. of Cluster Nodes :- 9120 NMs > Env Mode :- Secure > *Preconditions:* > ~9120 NMs were running > ~1250 applications were in running state > 35K applications were in pending state > *Test Steps:* > 1. Submit applications from 5 clients, each client with 2 threads, across 10 > queues > 2. Once application submission increases (each distributed shell application > will call getClusterNodes) > *ClientRMService#getClusterNodes tries to get > ClusterNodeTracker#getNodeReport where the nodes map is locked.* > {quote} > "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 > tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x7f759f6d8858> (a > java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) > at >
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792) > {quote} > *Instead we can make nodes as concurrentHashMap and remove readlock* -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
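The proposal at the end of the description above — back the node map with a ConcurrentHashMap so getNodeReport needs no read lock — could be sketched as follows. Names are illustrative, not the actual ClusterNodeTracker patch, and the trade-off is that readers get a weakly consistent view instead of a lock-protected snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the YARN-9738 direction: a ConcurrentHashMap-backed
// tracker whose report lookups take no ReentrantReadWriteLock, so a slow
// getClusterNodes caller cannot block application submission.
class LockFreeNodeTracker {
  static final class NodeReport {
    final String nodeId;
    NodeReport(String nodeId) { this.nodeId = nodeId; }
  }

  private final ConcurrentHashMap<String, NodeReport> nodes =
      new ConcurrentHashMap<>();

  void addNode(String nodeId) {
    nodes.put(nodeId, new NodeReport(nodeId));
  }

  void removeNode(String nodeId) {
    nodes.remove(nodeId);
  }

  // Lock-free read: ConcurrentHashMap serves gets without blocking writers,
  // at the cost of a weakly consistent (not point-in-time) view.
  NodeReport getNodeReport(String nodeId) {
    return nodes.get(nodeId);
  }

  // Copies the current values; concurrent add/remove during the copy may or
  // may not be reflected, which is acceptable for a cluster-nodes report.
  List<NodeReport> getAllNodeReports() {
    return new ArrayList<>(nodes.values());
  }
}
```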
[jira] [Commented] (YARN-9468) Fix inaccurate documentations in Placement Constraints
[ https://issues.apache.org/jira/browse/YARN-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916640#comment-16916640 ] Hadoop QA commented on YARN-9468: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 31m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 47m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-717/9/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/717 | | JIRA Issue | YARN-9468 | | Optional Tests | dupname asflicense mvnsite | | uname | Linux 15bc56f1d87d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 3329257 | | Max. process+thread count | 413 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-717/9/console | | versions | git=2.7.4 maven=3.3.9 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. 
> Fix inaccurate documentations in Placement Constraints > -- > > Key: YARN-9468 > URL: https://issues.apache.org/jira/browse/YARN-9468 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.2.0 >Reporter: hunshenshi >Assignee: hunshenshi >Priority: Major > > Document Placement Constraints > *First* > {code:java} > zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code} > * place 5 containers with tag “hbase” with affinity to a rack on which > containers with tag “zk” are running (i.e., an “hbase” container > should{color:#ff} not{color} be placed at a rack where an “zk” container > is running, given that “zk” is the TargetTag of the second constraint); > The _*not*_ word in brackets should be deleted. > > *Second* > {code:java} > PlacementSpec => "" | KeyVal;PlacementSpec > {code} > The semicolon should be replaced by a colon. > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9738: Attachment: (was: YARN-9738-003.patch)
[jira] [Updated] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9738: Attachment: YARN-9738-003.patch
[jira] [Updated] (YARN-9785) Application gets activated even when AM memory has reached
[ https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9785: Attachment: YARN-9785-001.patch > Application gets activated even when AM memory has reached > -- > > Key: YARN-9785 > URL: https://issues.apache.org/jira/browse/YARN-9785 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Blocker > Attachments: YARN-9785-001.patch > > > Configure below property in resource-types.xml > {quote} > yarn.resource-types > yarn.io/gpu > > {quote} > Submit applications even after AM limit for a queue is reached. Applications > get activated even after limit is reached > !queue.png! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9579) the property of sharedcache in mapred-default.xml
[ https://issues.apache.org/jira/browse/YARN-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916628#comment-16916628 ] Hadoop QA commented on YARN-9579: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 32m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 50s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-848/10/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/848 | | JIRA Issue | YARN-9579 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml | | uname | Linux 1a74f0dda78e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 3329257 | | Default Java | 1.8.0_222 | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-848/10/testReport/ | | Max. process+thread count | 1582 (vs. ulimit of 5500) | | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-848/10/console | | versions | git=2.7.4 maven=3.3.9 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. > the property of sharedcache in mapred-default.xml > - > > Key: YARN-9579 > URL: https://issues.apache.org/jira/browse/YARN-9579 > Pr
[jira] [Commented] (YARN-9479) Change String.equals to Objects.equals(String,String) to avoid possible NullPointerException
[ https://issues.apache.org/jira/browse/YARN-9479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916608#comment-16916608 ] Hadoop QA commented on YARN-9479: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} https://github.com/apache/hadoop/pull/738 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | GITHUB PR | https://github.com/apache/hadoop/pull/738 | | JIRA Issue | YARN-9479 | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-738/11/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. > Change String.equals to Objects.equals(String,String) to avoid possible > NullPointerException > > > Key: YARN-9479 > URL: https://issues.apache.org/jira/browse/YARN-9479 > Project: Hadoop YARN > Issue Type: Bug >Reporter: bd2019us >Priority: Major > Attachments: 1.patch > > > Hello, > I found that the String "queueName" carries a risk of > NullPointerException since it is used immediately after initialization and > there is no null check. One recommended API is > Objects.equals(String,String), which avoids this exception. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
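The null-safety difference described in the issue is easy to demonstrate. The helper below is illustrative (the name `sameQueue` is hypothetical, not from the patch): `a.equals(b)` throws NullPointerException when the receiver `a` is null, while `Objects.equals` handles null on either side.

```java
import java.util.Objects;

class EqualsDemo {
  // Objects.equals(a, b) returns true when both are null, false when
  // exactly one is null, and otherwise delegates to a.equals(b) --
  // so no null check is needed at the call site.
  static boolean sameQueue(String a, String b) {
    return Objects.equals(a, b);
  }
}
```

With a plain `a.equals(b)` the first call below would throw instead of returning false.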
[jira] [Commented] (YARN-9785) Application gets activated even when AM memory has reached
[ https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916544#comment-16916544 ] Bibin A Chundatt commented on YARN-9785: Thank you [~BilwaST] for raising this. It looks like an API interface level issue in DominantResourceCalculator#compare, or wrong usage. *0* should be returned only when all the resources are equal, but in this case, if one resource is greater and another is less, the compare returns *0*. {code:java} /** * Compare two resources - if the value for every resource type for the lhs * is greater than that of the rhs, return 1. If the value for every resource * type in the lhs is less than the rhs, return -1. Otherwise, return 0 * * @param lhs resource to be compared * @param rhs resource to be compared * @return 0, 1, or -1 */ private int compare(Resource lhs, Resource rhs) { public int compare(Resource clusterResource, Resource lhs, Resource rhs, boolean singleType) { {code} Cluster resource <10,10,0> (memory, cpu, gpu) ||lhs||rhs|| |<1,0>|<0,1>|returns --> 0| ResourceCalculator#compare expects *0* only if the values are equal. All the callers that expect all the fields to be lessThanEqual/greaterThanEqual are affected. [~rohithsharma]/[~tangzhankun]/[~sunil.gov...@gmail.com]
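The contract quoted in the comment can be reproduced with a simplified model. This is not the real DominantResourceCalculator (which also normalizes against the cluster resource); it is a minimal sketch of the documented rule, just enough to show how the mixed <1,0> vs <0,1> case collapses to 0:

```java
// Simplified model of the documented compare contract: 1 when every lhs
// resource dominates, -1 when every rhs resource dominates, 0 otherwise.
// The "otherwise" branch is the problem: a mixed result (one resource
// greater, another less) is indistinguishable from true equality, so
// callers using compare(...) <= 0 as "lessThanOrEqual" accept it.
class ResourceCompareSketch {
  static int compare(long[] lhs, long[] rhs) {
    boolean anyGreater = false, anyLess = false;
    for (int i = 0; i < lhs.length; i++) {
      if (lhs[i] > rhs[i]) anyGreater = true;
      if (lhs[i] < rhs[i]) anyLess = true;
    }
    if (anyGreater && !anyLess) return 1;
    if (anyLess && !anyGreater) return -1;
    return 0; // also returned for the mixed <1,0> vs <0,1> case
  }
}
```

This matches the table in the comment: an AM request that exceeds the limit in memory but not in vcores compares as "equal" to the limit, so the application is activated.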
[jira] [Updated] (YARN-9781) SchedConfCli to get current stored scheduler configuration
[ https://issues.apache.org/jira/browse/YARN-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9781: Attachment: YARN-9781-003.patch > SchedConfCli to get current stored scheduler configuration > -- > > Key: YARN-9781 > URL: https://issues.apache.org/jira/browse/YARN-9781 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9781-001.patch, YARN-9781-002.patch, > YARN-9781-003.patch > > > SchedConfCLI currently allows add / update / remove queue operations. It does not > support getting the stored configuration, which RMWebServices provides as part of YARN-8559. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
[ https://issues.apache.org/jira/browse/YARN-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9789: Parent: YARN-5734 Issue Type: Sub-task (was: Bug) > Disable Option for Write Ahead Logs of LogMutation > -- > > Key: YARN-9789 > URL: https://issues.apache.org/jira/browse/YARN-9789 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > When yarn.scheduler.configuration.store.max-logs is set to zero, the > YARNConfigurationStore (ZK, LevelDB) reads the write ahead logs from the > backend which is not needed. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
[ https://issues.apache.org/jira/browse/YARN-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9789: Component/s: capacity scheduler
[jira] [Created] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
Prabhu Joseph created YARN-9789: --- Summary: Disable Option for Write Ahead Logs of LogMutation Key: YARN-9789 URL: https://issues.apache.org/jira/browse/YARN-9789 Project: Hadoop YARN Issue Type: Bug Reporter: Prabhu Joseph Assignee: Prabhu Joseph When yarn.scheduler.configuration.store.max-logs is set to zero, the YARNConfigurationStore (ZK, LevelDB) reads the write ahead logs from the backend which is not needed. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9789) Disable Option for Write Ahead Logs of LogMutation
[ https://issues.apache.org/jira/browse/YARN-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9789: Affects Version/s: 3.3.0
[jira] [Updated] (YARN-9788) Queue Management API - does not support parallel updates
[ https://issues.apache.org/jira/browse/YARN-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9788: Affects Version/s: 3.3.0 > Queue Management API - does not support parallel updates > > > Key: YARN-9788 > URL: https://issues.apache.org/jira/browse/YARN-9788 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > Queue Management API - does not support parallel updates. When there are two > parallel schedule conf updates (logAndApplyMutation), the first update is > overwritten by the second one. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9788) Queue Management API - does not support parallel updates
[ https://issues.apache.org/jira/browse/YARN-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9788: Parent: YARN-5734 Issue Type: Sub-task (was: Bug)
[jira] [Created] (YARN-9788) Queue Management API - does not support parallel updates
Prabhu Joseph created YARN-9788: --- Summary: Queue Management API - does not support parallel updates Key: YARN-9788 URL: https://issues.apache.org/jira/browse/YARN-9788 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Reporter: Prabhu Joseph Assignee: Prabhu Joseph Queue Management API - does not support parallel updates. When there are two parallel schedule conf updates (logAndApplyMutation), the first update is overwritten by the second one. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
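One common remedy for the lost-update problem described above is optimistic concurrency: each mutation carries the configuration version it was read against, and a stale writer fails instead of silently overwriting the first update. The sketch below is illustrative only (the class and method names are hypothetical, not the actual YARN-9788 patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative optimistic-concurrency guard for a scheduler conf store.
// Two parallel logAndApplyMutation-style updates read the same version;
// only the first commit succeeds, the second must re-read and retry.
class ConfStoreSketch {
  private final Map<String, String> conf = new ConcurrentHashMap<>();
  private final AtomicLong version = new AtomicLong();

  long currentVersion() {
    return version.get();
  }

  // Apply the mutation only if no other update landed since readVersion.
  synchronized boolean applyMutation(long readVersion, String key, String value) {
    if (readVersion != version.get()) {
      return false; // lost the race; caller must re-read and retry
    }
    conf.put(key, value);
    version.incrementAndGet();
    return true;
  }
}
```

The same idea maps onto a ZK-backed store by using the znode version in a conditional setData, so the check-and-commit happens atomically on the server.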
[jira] [Commented] (YARN-9714) ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby
[ https://issues.apache.org/jira/browse/YARN-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916494#comment-16916494 ] Rohith Sharma K S commented on YARN-9714: - [~bibinchundatt] Would you please take a look at the patch? > ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby > - > > Key: YARN-9714 > URL: https://issues.apache.org/jira/browse/YARN-9714 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Labels: memory-leak > Attachments: YARN-9714.001.patch, YARN-9714.002.patch, > YARN-9714.003.patch, YARN-9714.004.patch > > > Recently RM full GC happened in one of our clusters, after investigating the > dump memory and jstack, I found two places in RM may cause memory leaks after > RM transitioned to standby: > # Release cache cleanup timer in AbstractYarnScheduler never be canceled. > # ZooKeeper connection in ZKRMStateStore never be closed. > To solve those leaks, we should close the connection or cancel the timer when > services are stopping. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
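The ownership check discussed on this issue (close the ZooKeeper connection in serviceStop only when the state store created it itself, i.e. the RM did not hand one in) can be sketched with plain stand-ins. These classes are illustrative models, not the real ZKRMStateStore or ZKCuratorManager:

```java
// Stand-in for a closeable ZooKeeper connection manager.
class ZkManagerSketch implements AutoCloseable {
  boolean closed = false;

  @Override
  public void close() {
    closed = true;
  }
}

// Mirrors the review suggestion: init and close stay symmetric. The store
// closes the connection on stop only when it owns it, so a manager shared
// with the RM is left alone, while a store-created one cannot leak after
// the RM transitions to standby.
class StateStoreSketch {
  private final ZkManagerSketch zkManager;
  private final boolean ownsConnection;

  StateStoreSketch(ZkManagerSketch rmProvided) {
    if (rmProvided == null) {
      this.zkManager = new ZkManagerSketch(); // store-owned connection
      this.ownsConnection = true;
    } else {
      this.zkManager = rmProvided;            // shared with the RM
      this.ownsConnection = false;
    }
  }

  void serviceStop() {
    if (ownsConnection) {
      zkManager.close();
    }
  }

  ZkManagerSketch manager() {
    return zkManager;
  }
}
```

Keying the close decision on the same null check used at initialization avoids a second source of truth about who owns the connection.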
[jira] [Comment Edited] (YARN-9785) Application gets activated even when AM memory has reached
[ https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916485#comment-16916485 ] Bilwa S T edited comment on YARN-9785 at 8/27/19 7:55 AM: -- In LeafQueue#activateApplications there is a lessThanOrEqual check between amResourceRequest and the AM limit for a queue, which in turn calls DominantResourceAllocator#compare since the GPU resource is zero. DominantResourceAllocator#compare returns 0 because lhsMemory > rhsMemory && lhsVcore < rhsVcore, so the check returns true and the application gets activated. To solve this, lessThanOrEqual should make sure that none of the lhs resources is greater than the corresponding rhs resource. was (Author: bilwast): In LeafQueue#activateApplications there is a lessThanOrEqual to check for amResourceRequest and amlimit for a queue which in turn calls DominantResourceAllocator#compare as GPU resource is zero. DominantResourceAllocator#compare returns 0 as lhsMemory > rhsMemory && lhsvcore < rhsVcore which would return true and then application gets activated.
[jira] [Commented] (YARN-9785) Application gets activated even when AM memory has reached
[ https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916485#comment-16916485 ] Bilwa S T commented on YARN-9785: - In LeafQueue#activateApplications there is a lessThanOrEqual check between amResourceRequest and the AM limit for a queue, which in turn calls DominantResourceAllocator#compare. Since the GPU resource is zero, DominantResourceAllocator#compare returns 0 when lhsMemory > rhsMemory && lhsVcore < rhsVcore, so lessThanOrEqual returns true and the application gets activated. > Application gets activated even when AM memory has reached > -- > > Key: YARN-9785 > URL: https://issues.apache.org/jira/browse/YARN-9785 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Blocker > > Configure the below property in resource-types.xml > {quote} > yarn.resource-types > yarn.io/gpu > > {quote} > Submit applications even after the AM limit for a queue is reached. Applications > get activated even after the limit is reached > !queue.png! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
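The behavior described in the comment can be reproduced with a minimal standalone model (the simplified two-resource signatures below are hypothetical, not the real Resource / DominantResourceCalculator classes): a compare() that treats mixed dominance as equal lets an oversized AM request through, while a component-wise check rejects it.

```java
// Simplified model of the bug: when neither side dominates in every
// dimension (lhsMemory > rhsMemory but lhsVcores < rhsVcores), compare()
// returns 0, so a lessThanOrEqual built on it passes the AM-limit check.
public class AmLimitCheckSketch {

    // Mimics the mixed-dominance outcome described in the comment.
    static int dominantStyleCompare(long lhsMem, int lhsVcores, long rhsMem, int rhsVcores) {
        boolean memGreater = lhsMem > rhsMem;
        boolean vcoreGreater = lhsVcores > rhsVcores;
        if (memGreater && vcoreGreater) {
            return 1;                         // lhs dominates everywhere
        }
        if (!memGreater && !vcoreGreater && (lhsMem < rhsMem || lhsVcores < rhsVcores)) {
            return -1;                        // rhs dominates everywhere
        }
        return 0;                             // mixed dominance collapses to "equal"
    }

    // Buggy check: activates the application whenever compare() <= 0.
    static boolean buggyLessThanOrEqual(long lhsMem, int lhsVcores, long rhsMem, int rhsVcores) {
        return dominantStyleCompare(lhsMem, lhsVcores, rhsMem, rhsVcores) <= 0;
    }

    // Proposed behavior: no single lhs component may exceed the rhs limit.
    static boolean componentWiseLessThanOrEqual(long lhsMem, int lhsVcores, long rhsMem, int rhsVcores) {
        return lhsMem <= rhsMem && lhsVcores <= rhsVcores;
    }
}
```

For an AM asking 6144 MB / 1 vcore against a 4096 MB / 2 vcore limit, the buggy check passes (memory over, vcores under, compare() == 0) while the component-wise check correctly rejects it.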
[jira] [Commented] (YARN-9786) testCancelledDelegationToken fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916483#comment-16916483 ] Adam Antal commented on YARN-9786: -- Ah, indeed. Quite strange: I searched in the Apache JIRA, but there were no results for the test name. Closing as it's a duplicate. > testCancelledDelegationToken fails intermittently > - > > Key: YARN-9786 > URL: https://issues.apache.org/jira/browse/YARN-9786 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Priority: Major > Attachments: testCancelledDelegationToken.txt > > > testCancelledDelegationToken[0] fails intermittently with the following > error/stack trace: > {noformat} > java.io.IOException: Server returned HTTP response code: 400 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:446) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:267) > {noformat} > I'll attach the stdout as well to this issue. > It seems that the test helper infrastructure does not come up correctly. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-9786) testCancelledDelegationToken fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Antal resolved YARN-9786. -- Resolution: Duplicate > testCancelledDelegationToken fails intermittently > - > > Key: YARN-9786 > URL: https://issues.apache.org/jira/browse/YARN-9786 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Priority: Major > Attachments: testCancelledDelegationToken.txt > > > testCancelledDelegationToken[0] fails intermittently with the following > error/stack trace: > {noformat} > java.io.IOException: Server returned HTTP response code: 400 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:446) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:267) > {noformat} > I'll attach the stdout as well to this issue. > It seems that the test helper infrastructure does not come up correctly. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7145) Identify potential flaky unit tests
[ https://issues.apache.org/jira/browse/YARN-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned YARN-7145: Assignee: Julia Kinga Marton (was: Szilard Nemeth) > Identify potential flaky unit tests > --- > > Key: YARN-7145 > URL: https://issues.apache.org/jira/browse/YARN-7145 > Project: Hadoop YARN > Issue Type: Test > Components: nodemanager, resourcemanager >Reporter: Miklos Szegedi >Assignee: Julia Kinga Marton >Priority: Minor > Labels: newbie > Attachments: YARN-7145.000.patch, YARN-7145.001.patch > > > I intend to add a 200 milliseconds sleep into AsyncDispatcher, and run the > job to identify the tests that are potentially flaky. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9640) Slow event processing could cause too many attempt unregister events
[ https://issues.apache.org/jira/browse/YARN-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-9640: --- Attachment: YARN-9640-branch-3.2.001.patch > Slow event processing could cause too many attempt unregister events > > > Key: YARN-9640 > URL: https://issues.apache.org/jira/browse/YARN-9640 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Labels: scalability > Fix For: 3.3.0 > > Attachments: YARN-9640-branch-3.2.001.patch, YARN-9640.001.patch, > YARN-9640.002.patch, YARN-9640.003.patch > > > During verification in one of our test clusters, we found that the number of attempt > unregister events was about 300k+. > # All AM containers completed. > # AMRMClientImpl sends finishApplicationMaster. > # AMRMClient checks the finish status every 100ms using the > finishApplicationMaster request. > # AMRMClientImpl#unregisterApplicationMaster > {code:java} > while (true) { > FinishApplicationMasterResponse response = > rmClient.finishApplicationMaster(request); > if (response.getIsUnregistered()) { > break; > } > LOG.info("Waiting for application to be successfully unregistered."); > Thread.sleep(100); > } > {code} > # The ApplicationMasterService finishApplicationMaster interface sends > unregister events on every status update. > We should send the unregister event only once and cache that it was sent; subsequent > requests should be ignored, with a not-unregistered response sent back to the AM, > instead of overloading the event queue. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
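A standalone sketch of the proposed deduplication (class and method names here are hypothetical; the real patch works on ApplicationMasterService and the RM dispatcher): enqueue the attempt-unregister event only on the first finishApplicationMaster call per attempt, and answer later 100ms polls from a cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: remember per-attempt that the unregister event was
// already enqueued, so 100ms polling from AMRMClientImpl cannot flood the
// dispatcher queue with duplicate events.
public class UnregisterDedupSketch {
    private final ConcurrentMap<String, Boolean> unregisterSent = new ConcurrentHashMap<>();
    private final AtomicInteger eventsEnqueued = new AtomicInteger();

    // Returns true only when this call actually enqueued the event;
    // duplicate polls get the cached "not yet unregistered" answer instead.
    public boolean finishApplicationMaster(String attemptId) {
        if (unregisterSent.putIfAbsent(attemptId, Boolean.TRUE) == null) {
            // Stand-in for dispatcher.handle(new unregistration event ...).
            eventsEnqueued.incrementAndGet();
            return true;
        }
        return false;
    }

    public int getEventsEnqueued() { return eventsEnqueued.get(); }
}
```

Even if the AM polls thousands of times while event processing is slow, only one unregister event ever reaches the queue per attempt.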
[jira] [Updated] (YARN-9783) Remove low-level zookeeper test to be able to build Hadoop against zookeeper 3.5.5
[ https://issues.apache.org/jira/browse/YARN-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szalay-Beko Mate updated YARN-9783: --- Attachment: (was: YARN-9783.001.patch) > Remove low-level zookeeper test to be able to build Hadoop against zookeeper > 3.5.5 > -- > > Key: YARN-9783 > URL: https://issues.apache.org/jira/browse/YARN-9783 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szalay-Beko Mate >Assignee: Szalay-Beko Mate >Priority: Major > Attachments: YARN-9783.001.patch > > > ZooKeeper 3.5.5 is the latest stable release. It contains many new > features (including SSL-related improvements which are very important for > production use; see [the release > notes|https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html]). Yet there > should be no backward-incompatible changes in the API, so applications > using ZooKeeper clients should build against the new ZooKeeper without any > problem, and the new ZooKeeper client should work with the older (3.4) servers > without any issue, at least until someone starts to use new functionality. > The aim of this ticket is not to change the ZooKeeper version used by Hadoop > YARN yet, but to enable people to rebuild and test Hadoop with the new > ZooKeeper version. > Currently the Hadoop build (with ZooKeeper 3.5.5) fails because of a YARN > test case: > [TestSecureRegistry.testLowlevelZKSaslLogin()|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureRegistry.java#L64]. > This test case seems to use low-level ZooKeeper internal code, which changed > in the new ZooKeeper version. Although I am not sure what the original > reasoning was for including this test in the YARN code, I propose to remove > it, and if there is still any missing test case in ZooKeeper, then let's > file a ZooKeeper ticket to test this scenario there.
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9783) Remove low-level zookeeper test to be able to build Hadoop against zookeeper 3.5.5
[ https://issues.apache.org/jira/browse/YARN-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szalay-Beko Mate updated YARN-9783: --- Attachment: YARN-9783.001.patch > Remove low-level zookeeper test to be able to build Hadoop against zookeeper > 3.5.5 > -- > > Key: YARN-9783 > URL: https://issues.apache.org/jira/browse/YARN-9783 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szalay-Beko Mate >Assignee: Szalay-Beko Mate >Priority: Major > Attachments: YARN-9783.001.patch > > > ZooKeeper 3.5.5 is the latest stable release. It contains many new > features (including SSL-related improvements which are very important for > production use; see [the release > notes|https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html]). Yet there > should be no backward-incompatible changes in the API, so applications > using ZooKeeper clients should build against the new ZooKeeper without any > problem, and the new ZooKeeper client should work with the older (3.4) servers > without any issue, at least until someone starts to use new functionality. > The aim of this ticket is not to change the ZooKeeper version used by Hadoop > YARN yet, but to enable people to rebuild and test Hadoop with the new > ZooKeeper version. > Currently the Hadoop build (with ZooKeeper 3.5.5) fails because of a YARN > test case: > [TestSecureRegistry.testLowlevelZKSaslLogin()|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureRegistry.java#L64]. > This test case seems to use low-level ZooKeeper internal code, which changed > in the new ZooKeeper version. Although I am not sure what the original > reasoning was for including this test in the YARN code, I propose to remove > it, and if there is still any missing test case in ZooKeeper, then let's > file a ZooKeeper ticket to test this scenario there.
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org