[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905908#comment-16905908 ] Hadoop QA commented on YARN-9738: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 47s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}170m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | | | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9738 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977442/YARN-9738-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 999b769a2c7b 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 454420e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24548/artifact/out/p
[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission
[ https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905952#comment-16905952 ] Sunil Govindan commented on YARN-9738: -- Hi [~BilwaST], {{nodes}} is operated on under a read/write lock in ClusterNodeTracker. Converting it to a ConcurrentHashMap also impacts other code paths. If the writeLock and the ConcurrentHashMap are not used together carefully, it could cause redundant locking. Thanks > Remove lock on ClusterNodeTracker#getNodeReport as it blocks application > submission > --- > > Key: YARN-9738 > URL: https://issues.apache.org/jira/browse/YARN-9738 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9738-001.patch, YARN-9738-002.patch > > > *Env :* > Server OS :- UBUNTU > No. of Cluster Node:- 9120 NMs > Env Mode:- [Secure / Non secure]Secure > *Preconditions:* > ~9120 NMs were running > ~1250 applications were in running state > 35K applications were in pending state > *Test Steps:* > 1. Submit applications from 5 clients, each client with 2 threads, across a total of 10 > queues > 2. Once application submission increases (each distributed shell application > will call getClusterNodes) > *ClientRMService#getClusterNodes calls > ClusterNodeTracker#getNodeReport, where the nodes map is locked.* > {quote} > "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 > tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x7f759f6d8858> (a > java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792) > {quote} > *Instead we can make nodes a ConcurrentHashMap and remove the read lock* -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
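A minimal sketch of the trade-off discussed above (hypothetical class, not the actual ClusterNodeTracker code): a ConcurrentHashMap makes single-key reads such as getNodeReport lock-free, but compound updates that must keep aggregate state consistent still need the write lock, so mixing the two carelessly can introduce redundant locking.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: shows where a ConcurrentHashMap removes the need for the
// read lock and where the write lock is still required.
public class NodeTrackerSketch<N> {
  private final ConcurrentHashMap<String, N> nodes = new ConcurrentHashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

  /** Single-key lookup: safe without the read lock once the map is concurrent. */
  public N getNodeReport(String nodeId) {
    return nodes.get(nodeId);
  }

  /** Compound update: still guarded so aggregate state stays consistent with the map. */
  public void addNode(String nodeId, N node) {
    lock.writeLock().lock();
    try {
      nodes.put(nodeId, node);
      // ...update cluster-wide counters/resources atomically with the put...
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}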
[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes
[ https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906018#comment-16906018 ] Peter Bacsko commented on YARN-9676: +1 LGTM (non-binding) > Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected > classes > > > Key: YARN-9676 > URL: https://issues.apache.org/jira/browse/YARN-9676 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > > During the development of the last items of YARN-6875, it was typically > difficult to extract information about the internal state of some log > aggregation related classes (e.g. {{AppLogAggregatorImpl}} and > {{LogAggregationFileController}}). > On my fork I added a few more messages to those classes like: > - displaying the number of log aggregation cycles > - displaying the names of the files currently considered for log aggregation > by containers > - immediately displaying any exception caught (and sent to the RM in the > diagnostic messages) during the log aggregation process. > Those messages were quite useful for debugging when an issue occurred, but > otherwise they flooded the NM log file with messages that are usually not > needed. I suggest adding (some of) these messages at DEBUG or TRACE level. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
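A hedged sketch of the kind of DEBUG/TRACE messages proposed above, using standard SLF4J parameterized logging; the class, field, and method names here are hypothetical and are not taken from AppLogAggregatorImpl.

{code:java}
import java.util.Set;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogAggregationDebugSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(LogAggregationDebugSketch.class);

  void onCycleFinished(int cycleCount, String containerId, Set<String> files) {
    // Parameterized logging: the message is only formatted when DEBUG is enabled.
    LOG.debug("Log aggregation cycle {} finished for container {}",
        cycleCount, containerId);
    if (LOG.isTraceEnabled()) {
      // Guard work that is only needed for the message (joining a large file list).
      LOG.trace("Files considered for aggregation: {}", String.join(", ", files));
    }
  }
}
{code}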
[jira] [Updated] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource
[ https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9444: - Summary: YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource (was: YARN Api Resource Utils's getRequestedResourcesFromConfig doesn't recognise yarn.io/gpu as a valid resource type) > YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize > yarn.io/gpu as a valid resource > -- > > Key: YARN-9444 > URL: https://issues.apache.org/jira/browse/YARN-9444 > Project: Hadoop YARN > Issue Type: Bug > Components: api >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Minor > Attachments: YARN-9444.001.patch > > > The original issue was that the jobclient test did not send the requested resource > type when it was specified on the command line, e.g.: > {code:java} > hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep > -Dmapreduce.reduce.resource.yarn.io/gpu=1 -m 10 -r 1 -mt 9 > {code} > After some investigation, it turned out that it only affects resource types whose > names contain '.' characters. The root cause is the regexp in the > getRequestedResourcesFromConfig method. > {code:java} > "^" + Pattern.quote(prefix) + "[^.]+$" > {code} > This regexp explicitly forbids any dots in the resource type name, which is > inconsistent with the default resource types for GPU and FPGA, which are > yarn.io/gpu and yarn.io/fpga respectively. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
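A small self-contained demonstration of the behaviour described above. The "relaxed" pattern is only an illustration of a possible direction, not necessarily the fix in the attached patch.

{code:java}
import java.util.regex.Pattern;

public class ResourcePrefixRegexDemo {
  public static void main(String[] args) {
    String prefix = "mapreduce.reduce.resource.";
    String key = prefix + "yarn.io/gpu";

    // Pattern quoted in the description: "[^.]+$" forbids any dot after the
    // prefix, so a name like "yarn.io/gpu" is rejected.
    Pattern strict = Pattern.compile("^" + Pattern.quote(prefix) + "[^.]+$");
    System.out.println(strict.matcher(key).matches()); // false

    // A relaxed variant that only requires a non-empty suffix would accept it.
    Pattern relaxed = Pattern.compile("^" + Pattern.quote(prefix) + ".+$");
    System.out.println(relaxed.matcher(key).matches()); // true
  }
}
{code}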
[jira] [Created] (YARN-9741) [JDK11] TestAHSWebServices.testAbout fails
Adam Antal created YARN-9741: Summary: [JDK11] TestAHSWebServices.testAbout fails Key: YARN-9741 URL: https://issues.apache.org/jira/browse/YARN-9741 Project: Hadoop YARN Issue Type: Bug Components: timelineservice Affects Versions: 3.2.0 Reporter: Adam Antal On openjdk-11.0.2 TestAHSWebServices.testAbout[0] fails consistently with the following stack trace: {noformat} [ERROR] Tests run: 40, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 7.9 s <<< FAILURE! - in org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices [ERROR] testAbout[0](org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices) Time elapsed: 0.241 s <<< FAILURE! org.junit.ComparisonFailure: expected: but was: at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices.testAbout(TestAHSWebServices.java:333) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9742) [JDK11] TestTimelineWebServicesWithSSL.testPutEntities fails
Adam Antal created YARN-9742: Summary: [JDK11] TestTimelineWebServicesWithSSL.testPutEntities fails Key: YARN-9742 URL: https://issues.apache.org/jira/browse/YARN-9742 Project: Hadoop YARN Issue Type: Bug Components: timelineservice Affects Versions: 3.2.0 Reporter: Adam Antal Tested on openjdk-11.0.2 on a Mac. Stack trace: {noformat} [ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 8.206 s <<< FAILURE! - in org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL [ERROR] testPutEntities(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL) Time elapsed: 0.366 s <<< ERROR! com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: HTTPS hostname wrong: should be <0.0.0.0> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155) at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter$1.run(TimelineConnector.java:392) at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:335) at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:405) at com.sun.jersey.api.client.Client.handle(Client.java:652) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682) at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:152) at org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL$TestTimelineClient$1.doPostingObject(TestTimelineWebServicesWithSSL.java:139) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112) at java.base/java.security.AccessController.doPrivileged(Native Method) at java.base/javax.security.auth.Subject.doAs(Subject.java:423) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:92) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178) at org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL.testPutEntities(TestTimelineWebServicesWithSSL.java:110) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter
[jira] [Created] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails
Adam Antal created YARN-9743: Summary: [JDK11] TestTimelineWebServices.testContextFactory fails Key: YARN-9743 URL: https://issues.apache.org/jira/browse/YARN-9743 Project: Hadoop YARN Issue Type: Bug Components: timelineservice Affects Versions: 3.2.0 Reporter: Adam Antal Tested on OpenJDK 11.0.2 on a Mac. Stack trace: {noformat} [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 36.016 s <<< FAILURE! - in org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices [ERROR] testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices) Time elapsed: 1.031 s <<< ERROR! java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) at java.base/java.lang.Class.forName0(Native Method) at java.base/java.lang.Class.forName(Class.java:315) at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85) at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112) at org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
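For context, com.sun.xml.internal.bind.v2.ContextFactory is a JDK-internal class that disappeared when the java.xml.bind module was removed in JDK 9+, which is why the reflective lookup fails on JDK 11. A common workaround, an assumption here and not necessarily how this issue was eventually resolved, is to fall back to the standalone JAXB runtime's factory class:

{code:java}
// Sketch of a JDK 11 friendly lookup. Assumes a standalone JAXB runtime
// (e.g. org.glassfish.jaxb:jaxb-runtime) is on the classpath for the fallback.
public final class JaxbContextFactoryLookup {
  private static final String JDK_INTERNAL_FACTORY =
      "com.sun.xml.internal.bind.v2.ContextFactory"; // present on JDK 8, gone on JDK 11
  private static final String STANDALONE_FACTORY =
      "com.sun.xml.bind.v2.ContextFactory";          // provided by the JAXB RI jar

  static Class<?> loadContextFactory() throws ClassNotFoundException {
    try {
      return Class.forName(JDK_INTERNAL_FACTORY);
    } catch (ClassNotFoundException e) {
      // JDK 9+ removed the java.xml.bind module, so try the external runtime.
      return Class.forName(STANDALONE_FACTORY);
    }
  }
}
{code}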
[jira] [Assigned] (YARN-9728) ResourceManager REST API can produce an illegal xml response
[ https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph reassigned YARN-9728: --- Assignee: Prabhu Joseph > ResourceManager REST API can produce an illegal xml response > - > > Key: YARN-9728 > URL: https://issues.apache.org/jira/browse/YARN-9728 > Project: Hadoop YARN > Issue Type: Bug > Components: api, resourcemanager >Affects Versions: 2.7.3 >Reporter: Thomas >Assignee: Prabhu Joseph >Priority: Major > Attachments: IllegalResponseChrome.png > > > When a Spark job throws an exception with a message containing a character > out of the range supported by XML 1.0, then > the application fails and the stack trace is stored in the > {{diagnostics}} field. So far, so good. > But the issue occurs when we try to get application information with the > ResourceManager REST API. > The XML response will contain the illegal XML 1.0 character and will be invalid. > *+Examples of illegal characters in XML 1.0:+* > * \u > * \u0001 > * \u0002 > * \u0003 > * \u0004 > _For more information about supported characters:_ > [https://www.w3.org/TR/xml/#charsets] > *+Example of illegal response from the Resource Manager API:+* > {code:xml} > > > application_1326821518301_0005 > user1 > job > a1 > FINISHED > FAILED > 100.0 > History > > http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5 > Exception in thread "main" java.lang.Exception: \u0001 > at com..main(JobWithSpecialCharMain.java:6) > [...] > > {code} > > *+Example of job to reproduce:+* > {code:java} > public class JobWithSpecialCharMain { > public static void main(String[] args) throws Exception { > throw new Exception("\u0001"); > } > } > {code} > !IllegalResponseChrome.png! -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
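A hedged sketch of the kind of sanitization the report calls for: stripping characters outside the XML 1.0 range (per https://www.w3.org/TR/xml/#charsets) from free-form text such as the diagnostics field before it is serialized. This is an illustration, not the eventual patch.

{code:java}
public final class XmlSanitizer {
  /** Removes code points that XML 1.0 does not allow in character data. */
  public static String stripInvalidXml10Chars(String input) {
    StringBuilder out = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      boolean valid = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (valid) {
        out.appendCodePoint(cp);
      }
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // "\u0001" is the character thrown by the reproducer job above.
    System.out.println(stripInvalidXml10Chars("Exception: \u0001 oops"));
  }
}
{code}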
[jira] [Commented] (YARN-9728) ResourceManager REST API can produce an illegal xml response
[ https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906026#comment-16906026 ] Prabhu Joseph commented on YARN-9728: - Thanks [~tde]. > ResourceManager REST API can produce an illegal xml response > - > > Key: YARN-9728 > URL: https://issues.apache.org/jira/browse/YARN-9728 > Project: Hadoop YARN > Issue Type: Bug > Components: api, resourcemanager >Affects Versions: 2.7.3 >Reporter: Thomas >Priority: Major > Attachments: IllegalResponseChrome.png > > > When a Spark job throws an exception with a message containing a character > out of the range supported by XML 1.0, then > the application fails and the stack trace is stored in the > {{diagnostics}} field. So far, so good. > But the issue occurs when we try to get application information with the > ResourceManager REST API. > The XML response will contain the illegal XML 1.0 character and will be invalid. > *+Examples of illegal characters in XML 1.0:+* > * \u > * \u0001 > * \u0002 > * \u0003 > * \u0004 > _For more information about supported characters:_ > [https://www.w3.org/TR/xml/#charsets] > *+Example of illegal response from the Resource Manager API:+* > {code:xml} > > > application_1326821518301_0005 > user1 > job > a1 > FINISHED > FAILED > 100.0 > History > > http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5 > Exception in thread "main" java.lang.Exception: \u0001 > at com..main(JobWithSpecialCharMain.java:6) > [...] > > {code} > > *+Example of job to reproduce:+* > {code:java} > public class JobWithSpecialCharMain { > public static void main(String[] args) throws Exception { > throw new Exception("\u0001"); > } > } > {code} > !IllegalResponseChrome.png! -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9290) Invalid SchedulingRequest not rejected in Scheduler PlacementConstraintsHandler
[ https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9290: Attachment: YARN-9290-006.patch > Invalid SchedulingRequest not rejected in Scheduler > PlacementConstraintsHandler > > > Key: YARN-9290 > URL: https://issues.apache.org/jira/browse/YARN-9290 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9290-001.patch, YARN-9290-002.patch, > YARN-9290-003.patch, YARN-9290-004.patch, YARN-9290-005.patch, > YARN-9290-006.patch > > > SchedulingRequest with Invalid namespace is not rejected in Scheduler > PlacementConstraintsHandler. RM keeps on trying to allocateOnNode with > logging the exception. This is rejected in case of placement-processor > handler. > {code} > 2019-02-08 16:51:27,548 WARN > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator: > Failed to query node cardinality: > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.InvalidAllocationTagsQueryException: > Invalid namespace prefix: notselfi, valid values are: > all,not-self,app-id,app-tag,self > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.fromString(TargetApplicationsNamespace.java:277) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.parse(TargetApplicationsNamespace.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.AllocationTags.createAllocationTags(AllocationTags.java:93) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraintExpression(PlacementConstraintsUtil.java:78) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraint(PlacementConstraintsUtil.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyAndConstraint(PlacementConstraintsUtil.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:365) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.checkCardinalityAndPending(SingleConstraintAppPlacementAllocator.java:355) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.precheckNode(SingleConstraintAppPlacementAllocator.java:395) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.precheckNode(AppSchedulingInfo.java:779) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.preCheckForNodeCandidateSet(RegularContainerAllocator.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:837) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:890) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:977) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1173) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:795) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1630) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1624) > at > org.apache.hadoop.yarn.server.resourcemana
[jira] [Created] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
Prabhu Joseph created YARN-9744: --- Summary: RollingLevelDBTimelineStore.getEntityByTime fails with NPE Key: YARN-9744 URL: https://issues.apache.org/jira/browse/YARN-9744 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 3.2.0, 3.3.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph RollingLevelDBTimelineStore.getEntityByTime fails with NPE. {code} 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - IPC Server handler 0 on 10200, call org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 10.21.216.93:36392 Call#29446915 Retry#0 java.lang.NullPointerException at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) at org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9744: Attachment: YARN-9744-001.patch > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9744-001.patch > > > RollingLevelDBTimelineStore.getEntityByTime fails with NPE. > {code} > 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - > IPC Server handler 0 on 10200, call > org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from > 10.21.216.93:36392 Call#29446915 Retry#0 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) > at > org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) > at > org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9548) [Umbrella] Make YARN work well in elastic cloud environments
[ https://issues.apache.org/jira/browse/YARN-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906068#comment-16906068 ] Junping Du commented on YARN-9548: -- +1. I am quite interested in the autoscaling part. For horizontal scaling, we can leverage graceful decommission (YARN-914) to decommission/recommission nodes based on metrics monitoring. For vertical scaling, we can leverage dynamic resource allocation (YARN-291) to have a min/max resource setting on each node and update it according to resource profiling of each node. > [Umbrella] Make YARN work well in elastic cloud environments > > > Key: YARN-9548 > URL: https://issues.apache.org/jira/browse/YARN-9548 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Vinod Kumar Vavilapalli >Priority: Major > > YARN works well in static environments, but there isn't anything fundamentally broken > in YARN to stop us from making it work well in dynamic environments like > cloud (public or private) as well. > There are a few areas where we need to invest, though > # Autoscaling > -- cluster level: add/remove nodes intelligently based on metrics and/or > admin plugins > -- node level: scale nodes up/down vertically? > # Smarter scheduling > -- to pack containers as opposed to spreading them around to account for > nodes going away > -- to account for speculative nodes like spot instances > # Handling nodes going away better > -- by decommissioning sanely > -- dealing with auxiliary services data > # And any installation helpers in this dynamic world - scripts, operators > etc. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails
[ https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906090#comment-16906090 ] Wei-Chiu Chuang commented on YARN-9743: --- Hi Adam, could you try: (1) the latest openjdk 11.04 (2) list the configuration used. For example, build with jdk11 + jdk8 target + run on jdk11, build with jdk11 + jdk11 target + run on jdk11, or build with jdk8 run on jdk11. Thanks! > [JDK11] TestTimelineWebServices.testContextFactory fails > > > Key: YARN-9743 > URL: https://issues.apache.org/jira/browse/YARN-9743 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.2.0 >Reporter: Adam Antal >Priority: Major > > Tested on OpenJDK 11.0.2 on a Mac. > Stack trace: > {noformat} > [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: > 36.016 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices > [ERROR] > testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices) > Time elapsed: 1.031 s <<< ERROR! > java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory > at > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > at > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > at java.base/java.lang.Class.forName0(Native Method) > at java.base/java.lang.Class.forName(Class.java:315) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112) > at > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler
[ https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9133: --- Attachment: YARN-9133.007.patch > Make tests more easy to comprehend in TestGpuResourceHandler > > > Key: YARN-9133 > URL: https://issues.apache.org/jira/browse/YARN-9133 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9133.001.patch, YARN-9133.001.patch, > YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, > YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, > YARN-9133.007.patch > > > Tests are not quite easy to read: > - Some more helper methods would improve readability. > - Eliminating the boolean flag that controls if docker is used would also > improve readability and clarity. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
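Purely for illustration (hypothetical names, not the real TestGpuResourceHandler code), this is the kind of refactor the issue describes: replacing the boolean docker flag with intention-revealing helper methods.

{code:java}
class GpuResourceHandlerTestSketch {

  // Before: callers pass a flag, so every call site needs a comment to explain it.
  void verifyAllocation(boolean isDocker) { /* ... shared assertions ... */ }

  // After: two small helpers make each test read as plain English.
  void verifyAllocationWithDefaultRuntime() { verifyAllocation(false); }
  void verifyAllocationWithDockerRuntime()  { verifyAllocation(true); }
}
{code}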
[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906138#comment-16906138 ] Hadoop QA commented on YARN-9744: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 21s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 48m 39s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9744 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977471/YARN-9744-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b5a6c8039e7a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b507d2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24551/testReport/ | | Max. process+thread count | 417 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24551/cons
[jira] [Comment Edited] (YARN-7982) Do ACLs check while retrieving entity-types per application
[ https://issues.apache.org/jira/browse/YARN-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906143#comment-16906143 ] Prabhu Joseph edited comment on YARN-7982 at 8/13/19 12:32 PM: --- Thanks [~abmodi] for reviewing. 1. The {{TimelineReaderContext.getUserId()}} does not present if user does not set Query Parameter {{userId}} in Rest API {{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is fetched and set in the context at {{EntityTypes#readEntityTypes}} -> {{augmentparams}} from AppToFlowTable. Reusing the same context so that the {{TimelineReaderWebServices#getEntityTypes}} will have userid to check access. 2. In {{FileSystemTimelineReadeImpl}}, {{context.getUserId}} is null if user does not set Query Parameter {{userId}}. Setting it from the Flow Run Path. was (Author: prabhu joseph): Thanks [~abmodi] for reviewing. 1. The {{TimelineReaderContext.getUserId()}} does not present if user does not set Query Parameter {{userId}} in Rest API {{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is fetched and set in the context at {{EntityTypes#readEntityTypes}} -> {{augmentparams}} from AppToFlowTable. Reusing the same context so that the {{TimelineReaderWebServices#getEntityTypes}} will have userid to check access. 2. In {{FileSystemTimelineReadeImpl}}, {{context.getUserId}} is null if user does not set Query Parameter {{userId}}. Setting it from the Flow Run Path. > Do ACLs check while retrieving entity-types per application > --- > > Key: YARN-7982 > URL: https://issues.apache.org/jira/browse/YARN-7982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-7982-001.patch, YARN-7982-002.patch, > YARN-7982-003.patch > > > REST end point {{/apps/$appid/entity-types}} retrieves all the entity-types > for given application. This need to be guarded with ACL check > {code} > [yarn@yarn-ats-3 ~]$ curl > "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002?user.name=ambari-qa1"; > {"exception":"ForbiddenException","message":"java.lang.Exception: User > ambari-qa1 is not allowed to read TimelineService V2 > data.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"} > [yarn@yarn-ats-3 ~]$ curl > "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002/entity-types?user.name=ambari-qa1"; > ["YARN_APPLICATION_ATTEMPT","YARN_CONTAINER"] > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7982) Do ACLs check while retrieving entity-types per application
[ https://issues.apache.org/jira/browse/YARN-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906143#comment-16906143 ] Prabhu Joseph commented on YARN-7982: - Thanks [~abmodi] for reviewing. 1. The {{TimelineReaderContext.getUserId()}} is not present if the user does not set the Query Parameter {{userId}} in the REST API {{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is fetched and set in the context at {{EntityTypes#readEntityTypes}} -> {{augmentParams}} from the AppToFlowTable. The same context is reused so that {{TimelineReaderWebServices#getEntityTypes}} has the userId to check access. 2. In {{FileSystemTimelineReaderImpl}}, {{context.getUserId}} is null if the user does not set the Query Parameter {{userId}}. It is set from the Flow Run Path. > Do ACLs check while retrieving entity-types per application > --- > > Key: YARN-7982 > URL: https://issues.apache.org/jira/browse/YARN-7982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-7982-001.patch, YARN-7982-002.patch, > YARN-7982-003.patch > > > REST end point {{/apps/$appid/entity-types}} retrieves all the entity-types > for a given application. This needs to be guarded with an ACL check > {code} > [yarn@yarn-ats-3 ~]$ curl > "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002?user.name=ambari-qa1"; > {"exception":"ForbiddenException","message":"java.lang.Exception: User > ambari-qa1 is not allowed to read TimelineService V2 > data.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"} > [yarn@yarn-ats-3 ~]$ curl > "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002/entity-types?user.name=ambari-qa1"; > ["YARN_APPLICATION_ATTEMPT","YARN_CONTAINER"] > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
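A hedged sketch (hypothetical names, not the actual patch) of the guard this discussion describes: when the userId query parameter is absent, the application owner is resolved into the reader context (from the app-to-flow lookup, or from the flow run path in the filesystem reader), and the caller is checked against it before entity types are returned. The real endpoint reports org.apache.hadoop.yarn.webapp.ForbiddenException; a plain exception stands in here to keep the sketch self-contained.

{code:java}
import java.util.Arrays;
import java.util.List;

class EntityTypesAclSketch {
  List<String> getEntityTypes(String callerUser, String ownerFromContext,
                              boolean callerIsAdmin) {
    // Reject callers that are neither an admin nor the application owner.
    if (!callerIsAdmin && !callerUser.equals(ownerFromContext)) {
      throw new SecurityException(
          "User " + callerUser + " is not allowed to read TimelineService V2 data.");
    }
    return Arrays.asList("YARN_APPLICATION_ATTEMPT", "YARN_CONTAINER");
  }
}
{code}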
[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906148#comment-16906148 ] Rick Moritz commented on YARN-1902: --- This bug can also cause application crashes if the application handles "ContainerAllocated" events by stockpiling them and then scheduling tasks to those containers as they arrive. This usually leads to timeouts of the involved tokens, and to guesswork about why program logic is attempting to launch containers that have been assigned obsolete tokens. I also wonder how this mixes with the recent addition of "opportunistic allocation". Hadoop 3 would have been a great opportunity to close this bug :( > Allocation of too many containers when a second request is done with the same > resource capability > - > > Key: YARN-1902 > URL: https://issues.apache.org/jira/browse/YARN-1902 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.2.0, 2.3.0, 2.4.0 >Reporter: Sietse T. Au >Assignee: Sietse T. Au >Priority: Major > Labels: client > Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch > > > Regarding AMRMClientImpl > Scenario 1: > Given a ContainerRequest x with Resource y, when addContainerRequest is > called z times with x, allocate is called and at least one of the z allocated > containers is started, then if another addContainerRequest call is done and > subsequently an allocate call to the RM, (z+1) containers will be allocated, > where 1 container is expected. > Scenario 2: > No containers are started between the allocate calls. > Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) containers > are requested in both scenarios, but that only in the second scenario is the > correct behavior observed. > Looking at the implementation I have found that this (z+1) request is caused > by the structure of the remoteRequestsTable. The consequence of Map<Resource, ResourceRequestInfo> is that ResourceRequestInfo does not hold any > information about whether a request has been sent to the RM yet or not. > There are workarounds for this, such as releasing the excess containers > received. > The solution implemented is to initialize a new ResourceRequest in > ResourceRequestInfo when a request has been successfully sent to the RM. > The patch includes a test in which scenario one is tested. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
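The over-ask described in the report can be modelled with a small stand-alone sketch. The types below are simplified stand-ins for {{AMRMClientImpl}} internals (the real {{remoteRequestsTable}} is keyed on richer objects), so this only illustrates why a cumulative, send-unaware container count turns one extra request into an ask for z+1 containers; it is not the committed fix.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified model of the remoteRequestsTable bookkeeping discussed above;
// "resource capability" is reduced to a memory size for illustration.
public class OverAllocationSketch {

  static class RequestInfo {
    int numContainers;   // cumulative count; nothing records what was already sent
  }

  static Map<Integer, RequestInfo> table = new HashMap<>();

  static void addContainerRequest(int memoryMb) {
    table.computeIfAbsent(memoryMb, k -> new RequestInfo()).numContainers++;
  }

  static int nextAllocateAsk(int memoryMb) {
    // The problematic behaviour: the full cumulative count is sent again,
    // even though z containers were already allocated and one was started.
    return table.get(memoryMb).numContainers;
  }

  public static void main(String[] args) {
    int z = 3;
    for (int i = 0; i < z; i++) {
      addContainerRequest(1024);
    }
    // ... allocate() runs, z containers are allocated, at least one is started ...
    addContainerRequest(1024);
    System.out.println("containers asked for on the next allocate: "
        + nextAllocateAsk(1024));   // prints 4 where 1 is expected
  }
}
{code}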
[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906150#comment-16906150 ] Rick Moritz commented on YARN-1902: --- Oh, and Hadoop 2.7 can also be added to the affected version list. > Allocation of too many containers when a second request is done with the same > resource capability > - > > Key: YARN-1902 > URL: https://issues.apache.org/jira/browse/YARN-1902 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.2.0, 2.3.0, 2.4.0 >Reporter: Sietse T. Au >Assignee: Sietse T. Au >Priority: Major > Labels: client > Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch > > > Regarding AMRMClientImpl > Scenario 1: > Given a ContainerRequest x with Resource y, when addContainerRequest is > called z times with x, allocate is called and at least one of the z allocated > containers is started, then if another addContainerRequest call is done and > subsequently an allocate call to the RM, (z+1) containers will be allocated, > where 1 container is expected. > Scenario 2: > No containers are started between the allocate calls. > Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) containers > are requested in both scenarios, but that only in the second scenario is the > correct behavior observed. > Looking at the implementation I have found that this (z+1) request is caused > by the structure of the remoteRequestsTable. The consequence of Map<Resource, ResourceRequestInfo> is that ResourceRequestInfo does not hold any > information about whether a request has been sent to the RM yet or not. > There are workarounds for this, such as releasing the excess containers > received. > The solution implemented is to initialize a new ResourceRequest in > ResourceRequestInfo when a request has been successfully sent to the RM. > The patch includes a test in which scenario one is tested. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler
[ https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906170#comment-16906170 ] Hadoop QA commented on YARN-9133: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 2 unchanged - 3 fixed = 2 total (was 5) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9133 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977473/YARN-9133.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1e7aaede5ba6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b507d2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24552/testReport/ | | Max. process+thread count | 418 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24552/console | | Powered by | Apache Yetu
[jira] [Updated] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9744: Description: RollingLevelDBTimelineStore.getEntityByTime fails with NPE. {code} 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - IPC Server handler 0 on 10200, call org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 10.21.216.93:36392 Call#29446915 Retry#0 java.lang.NullPointerException at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) at org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) {code} This affects Rest Api to get entities. curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION was: RollingLevelDBTimelineStore.getEntityByTime fails with NPE. 
{code} 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - IPC Server handler 0 on 10200, call org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 10.21.216.93:36392 Call#29446915 Retry#0 java.lang.NullPointerException at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) at org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) {code} > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 >
[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906177#comment-16906177 ] Prabhu Joseph commented on YARN-9744: - [~abmodi] Can you review this Jira when you get time. This fixes NPE thrown by RollingLevelDBTimelineStore.getEntityByTime. > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9744-001.patch > > > RollingLevelDBTimelineStore.getEntityByTime fails with NPE. > {code} > 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - > IPC Server handler 0 on 10200, call > org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from > 10.21.216.93:36392 Call#29446915 Retry#0 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) > at > org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) > at > org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} > This affects Rest Api to get entities. > curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
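For readers following along, the general shape of a fix for this kind of NPE is a defensive null check while scanning stored entities. The sketch below is illustrative only; it uses a plain in-memory map as a stand-in for the LevelDB-backed store and is not the code in the attached patch.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: skip entries that cannot be materialised instead of
// dereferencing a null entity later in the scan.
public class NullSafeScanSketch {

  static Map<String, String> store = new HashMap<>();

  static String getEntityByTime(String key) {
    return store.get(key);        // may legitimately return null
  }

  public static void main(String[] args) {
    store.put("entity-1", "payload-1");
    store.put("entity-2", null);  // corrupt or partially written entry
    List<String> results = new ArrayList<>();
    for (String key : store.keySet()) {
      String entity = getEntityByTime(key);
      if (entity == null) {
        continue;                 // guard instead of throwing an NPE later
      }
      results.add(entity);
    }
    System.out.println(results);  // [payload-1]
  }
}
{code}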
[jira] [Commented] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource
[ https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906183#comment-16906183 ] Hadoop QA commented on YARN-9444: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 32s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 26s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 50s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}100m 8s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9444 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12964993/YARN-9444.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c9b6bcfe2a4b 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b507d2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Bui
[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906205#comment-16906205 ] Abhishek Modi commented on YARN-9744: - Thanks [~Prabhu Joseph] for the patch. LGTM. Committed to trunk. > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9744-001.patch > > > RollingLevelDBTimelineStore.getEntityByTime fails with NPE. > {code} > 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - > IPC Server handler 0 on 10200, call > org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from > 10.21.216.93:36392 Call#29446915 Retry#0 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) > at > org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) > at > org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} > This affects Rest Api to get entities. > curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9728) ResourceManager REST API can produce an illegal xml response
[ https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas updated YARN-9728: - Description: When a Spark job throws an exception with a message containing a character out of the range supported by xml 1.0, then the application fails and the stack trace will be stored into the {{diagnostics}} field. So far, so good. But the issue occurs when we try to get application information with the ResourceManager REST API The xml response will contain the illegal xml 1.0 char and will be invalid. *+Examples of illegal characters in xml 1.0 :+* * {{\u0000}} * {{\u0001}} * {{\u0002}} * {{\u0003}} * {{\u0004}} _For more information about supported characters :_ [https://www.w3.org/TR/xml/#charsets] *+Example of illegal response from the Resource Manager API :+* {code:xml} application_1326821518301_0005 user1 job a1 FINISHED FAILED 100.0 History http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5 Exception in thread "main" java.lang.Exception: \u0001 at com..main(JobWithSpecialCharMain.java:6) [...] {code} *+Example of job to reproduce :+* {code:java} public class JobWithSpecialCharMain { public static void main(String[] args) throws Exception { throw new Exception("\u0001"); } } {code} !IllegalResponseChrome.png! was: When a Spark job throws an exception with a message containing a character out of the range supported by xml 1.0, then the application fails and the stack trace will be stored into the {{diagnostics}} field. So far, so good. But the issue occurs when we try to get application information with the ResourceManager REST API The xml response will contain the illegal xml 1.0 char and will be invalid. *+Examples of illegal characters in xml 1.0 :+* * \u0000 * \u0001 * \u0002 * \u0003 * \u0004 _For more information about supported characters :_ [https://www.w3.org/TR/xml/#charsets] *+Example of illegal response from the Resource Manager API :+* {code:xml} application_1326821518301_0005 user1 job a1 FINISHED FAILED 100.0 History http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5 Exception in thread "main" java.lang.Exception: \u0001 at com..main(JobWithSpecialCharMain.java:6) [...] {code} *+Example of job to reproduce :+* {code:java} public class JobWithSpecialCharMain { public static void main(String[] args) throws Exception { throw new Exception("\u0001"); } } {code} !IllegalResponseChrome.png! > ResourceManager REST API can produce an illegal xml response > - > > Key: YARN-9728 > URL: https://issues.apache.org/jira/browse/YARN-9728 > Project: Hadoop YARN > Issue Type: Bug > Components: api, resourcemanager >Affects Versions: 2.7.3 >Reporter: Thomas >Assignee: Prabhu Joseph >Priority: Major > Attachments: IllegalResponseChrome.png > > > When a Spark job throws an exception with a message containing a character > out of the range supported by xml 1.0, then > the application fails and the stack trace will be stored into the > {{diagnostics}} field. So far, so good. > But the issue occurs when we try to get application information with the > ResourceManager REST API > The xml response will contain the illegal xml 1.0 char and will be invalid. 
> *+Examples of illegal characters in xml 1.0 :+* > * {{\u0000}} > * {{\u0001}} > * {{\u0002}} > * {{\u0003}} > * {{\u0004}} > _For more information about supported characters :_ > [https://www.w3.org/TR/xml/#charsets] > *+Example of illegal response from the Resource Manager API :+* > {code:xml} > > > application_1326821518301_0005 > user1 > job > a1 > FINISHED > FAILED > 100.0 > History > > http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5 > Exception in thread "main" java.lang.Exception: \u0001 > at com..main(JobWithSpecialCharMain.java:6) > [...] > > {code} > > *+Example of job to reproduce :+* > {code:java} > public class JobWithSpecialCharMain { > public static void main(String[] args) throws Exception { > throw new Exception("\u0001"); > } > } > {code} > !IllegalResponseChrome.png! -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
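The allowed XML 1.0 character ranges from the W3C reference above (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF) can be enforced by sanitising values such as the {{diagnostics}} field before they are written into an XML response. The stand-alone sketch below shows one way to do that; it is an illustration, not the fix attached to this issue.

{code:java}
// Stand-alone sketch: drop characters that are not legal in XML 1.0
// (https://www.w3.org/TR/xml/#charsets) before serialising a value.
public class XmlSanitizerSketch {

  static String stripIllegalXmlChars(String in) {
    StringBuilder out = new StringBuilder(in.length());
    int i = 0;
    while (i < in.length()) {
      int cp = in.codePointAt(i);
      boolean legal = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (legal) {
        out.appendCodePoint(cp);
      }
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String diagnostics =
        "Exception in thread \"main\" java.lang.Exception: \u0001";
    System.out.println(stripIllegalXmlChars(diagnostics));
  }
}
{code}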
[jira] [Commented] (YARN-9290) Invalid SchedulingRequest not rejected in Scheduler PlacementConstraintsHandler
[ https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906211#comment-16906211 ] Hadoop QA commented on YARN-9290: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 630 unchanged - 3 fixed = 630 total (was 633) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 79m 27s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}128m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9290 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977467/YARN-9290-006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f2890caea321 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b507d2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24549/testReport/ | | Max. process+thread count | 901 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24549/console | | Po
[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes
[ https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906220#comment-16906220 ] Adam Antal commented on YARN-9676: -- Thanks! > Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected > classes > > > Key: YARN-9676 > URL: https://issues.apache.org/jira/browse/YARN-9676 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > > During the development of the last items of YARN-6875, it was typically > difficult to extract information about the internal state of some log > aggregation related classes (e.g. {{AppLogAggregatorImpl}} and > {{LogAggregationFileController}}). > On my fork I added a few more messages to those classes like: > - displaying the number of log aggregation cycles > - displaying the names of the files currently considered for log aggregation > by containers > - immediately displaying any exception caught (and sent to the RM in the > diagnostic messages) during the log aggregation process. > Those messages were quite useful for debugging when an issue occurred, but > otherwise they flooded the NM log file with messages that are usually not > needed. I suggest adding (some of) these messages at DEBUG or TRACE level. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
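As an illustration of the guarded DEBUG/TRACE pattern being proposed, here is a generic SLF4J sketch. It assumes {{slf4j-api}} on the classpath and uses placeholder class and method names; it is not the actual {{AppLogAggregatorImpl}} change.

{code:java}
import java.util.Arrays;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Generic sketch: emit detailed per-cycle information only when DEBUG/TRACE
// is enabled, so normal NodeManager logs are not flooded.
public class LogAggregationLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(LogAggregationLoggingSketch.class);

  void onAggregationCycle(int cycle, List<String> candidateFiles, Exception caught) {
    LOG.debug("Log aggregation cycle {} considering {} files",
        cycle, candidateFiles.size());
    if (LOG.isTraceEnabled()) {
      LOG.trace("Candidate files for cycle {}: {}", cycle, candidateFiles);
    }
    if (caught != null) {
      // Surface exceptions immediately instead of only in the diagnostics
      // that are sent to the RM.
      LOG.debug("Exception caught during log aggregation cycle " + cycle, caught);
    }
  }

  public static void main(String[] args) {
    new LogAggregationLoggingSketch().onAggregationCycle(
        1, Arrays.asList("syslog", "stderr"), new Exception("simulated failure"));
  }
}
{code}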
[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906224#comment-16906224 ] Hudson commented on YARN-9744: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17100 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17100/]) YARN-9744. RollingLevelDBTimelineStore.getEntityByTime fails with NPE. (abmodi: rev b4097b96a39bad6214b01989e7f2fb37dad70793) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDBTimelineStore.java > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9744-001.patch > > > RollingLevelDBTimelineStore.getEntityByTime fails with NPE. > {code} > 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - > IPC Server handler 0 on 10200, call > org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from > 10.21.216.93:36392 Call#29446915 Retry#0 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) > at > org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) > at > org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} > This affects Rest Api to get entities. > curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler
[ https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9133: --- Attachment: YARN-9133.branch-3.2.001.patch > Make tests more easy to comprehend in TestGpuResourceHandler > > > Key: YARN-9133 > URL: https://issues.apache.org/jira/browse/YARN-9133 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9133.001.patch, YARN-9133.001.patch, > YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, > YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, > YARN-9133.007.patch, YARN-9133.branch-3.2.001.patch > > > Tests are not quite easy to read: > - Some more helper methods would improve readability. > - Eliminating the boolean flag that controls if docker is used would also > improve readability and clarity. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler
[ https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906266#comment-16906266 ] Peter Bacsko commented on YARN-9133: Uploaded patch for branch-3.2. Again, patch for branch-3.1 is not trivial, there are too many conflicts - I'd skip it. > Make tests more easy to comprehend in TestGpuResourceHandler > > > Key: YARN-9133 > URL: https://issues.apache.org/jira/browse/YARN-9133 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9133.001.patch, YARN-9133.001.patch, > YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, > YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, > YARN-9133.007.patch, YARN-9133.branch-3.2.001.patch > > > Tests are not quite easy to read: > - Some more helper methods would improve readability. > - Eliminating the boolean flag that controls if docker is used would also > improve readability and clarity. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE
[ https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906287#comment-16906287 ] Prabhu Joseph commented on YARN-9744: - Thanks [~abmodi]. > RollingLevelDBTimelineStore.getEntityByTime fails with NPE > -- > > Key: YARN-9744 > URL: https://issues.apache.org/jira/browse/YARN-9744 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9744-001.patch > > > RollingLevelDBTimelineStore.getEntityByTime fails with NPE. > {code} > 2019-08-07 12:58:55,990 WARN ipc.Server (Server.java:logException(2433)) - > IPC Server handler 0 on 10200, call > org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from > 10.21.216.93:36392 Call#29446915 Retry#0 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786) > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614) > at > org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168) > at > org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172) > at > org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} > This affects Rest Api to get entities. > curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9562: -- Attachment: YARN-9562.003.patch > Add Java changes for the new RuncContainerRuntime > - > > Key: YARN-9562 > URL: https://issues.apache.org/jira/browse/YARN-9562 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9562.001.patch, YARN-9562.002.patch, > YARN-9562.003.patch > > > This JIRA will be used to add the Java changes for the new > RuncContainerRuntime. This will work off of YARN-9560 to use much of the > existing DockerLinuxContainerRuntime code once it is moved up into an > abstract class that can be extended. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906295#comment-16906295 ] Eric Badger commented on YARN-9562: --- [~eyang], yes that's the error. I caught this error a while ago, but I guess I never uploaded a patch with the fix. Patch 003 fixes the issue. > Add Java changes for the new RuncContainerRuntime > - > > Key: YARN-9562 > URL: https://issues.apache.org/jira/browse/YARN-9562 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9562.001.patch, YARN-9562.002.patch, > YARN-9562.003.patch > > > This JIRA will be used to add the Java changes for the new > RuncContainerRuntime. This will work off of YARN-9560 to use much of the > existing DockerLinuxContainerRuntime code once it is moved up into an > abstract class that can be extended. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
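For context, the refactor this sub-task builds on (shared container runtime code hoisted into an abstract base class that a runc-based runtime can extend) has roughly the following shape. All class and method names below are placeholders for illustration, not the real YARN runtime API.

{code:java}
// Placeholder names throughout; this only illustrates the "pull shared code
// into an abstract base class" shape, not the actual YARN runtime classes.
abstract class OciContainerRuntimeSketch {
  // Shared behaviour both runtimes can reuse once it lives in the base class.
  final String containerWorkDir(String appId, String containerId) {
    return "/yarn/local/" + appId + "/" + containerId;
  }

  // Each concrete runtime supplies its own launch mechanics.
  abstract String launchCommand(String image, String containerId);
}

class DockerRuntimeSketch extends OciContainerRuntimeSketch {
  @Override
  String launchCommand(String image, String containerId) {
    return "docker run --name " + containerId + " " + image;
  }
}

class RuncRuntimeSketch extends OciContainerRuntimeSketch {
  @Override
  String launchCommand(String image, String containerId) {
    return "runc run --bundle /bundles/" + image + " " + containerId;
  }
}

public class RuntimeRefactorSketch {
  public static void main(String[] args) {
    OciContainerRuntimeSketch runtime = new RuncRuntimeSketch();
    System.out.println(runtime.containerWorkDir("app_1", "container_1"));
    System.out.println(runtime.launchCommand("centos7-bundle", "container_1"));
  }
}
{code}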
[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9217: --- Attachment: YARN-9217.009.patch > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
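The tolerant behaviour proposed in the description (warn, assume zero GPUs, and let the node keep running non-GPU jobs) looks roughly like the stand-alone sketch below. The {{index:minorNumber}} parsing and the method names are assumptions for illustration, not the attached patch.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch: parse a manually configured "index:minorNumber" GPU
// list but degrade to an empty device set on any malformed entry, so the
// NodeManager can still start and run non-GPU jobs.
public class TolerantGpuDiscoverySketch {

  static List<int[]> parseAllowedGpus(String configured) {
    List<int[]> devices = new ArrayList<>();
    if (configured == null || configured.trim().isEmpty()) {
      System.err.println("WARN: no GPU devices configured, assuming 0 GPUs");
      return devices;
    }
    for (String entry : configured.split(",")) {
      String[] parts = entry.trim().split(":");
      try {
        if (parts.length != 2) {
          throw new NumberFormatException("expected index:minorNumber");
        }
        devices.add(new int[] {
            Integer.parseInt(parts[0]), Integer.parseInt(parts[1])});
      } catch (NumberFormatException e) {
        System.err.println("WARN: ignoring misconfigured GPU list '" + configured
            + "', assuming 0 GPUs (" + e.getMessage() + ")");
        return new ArrayList<>();
      }
    }
    return devices;
  }

  public static void main(String[] args) {
    System.out.println(parseAllowedGpus("0:0,1:1").size());  // 2
    System.out.println(parseAllowedGpus("0:0,oops").size()); // 0, node still starts
  }
}
{code}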
[jira] [Created] (YARN-9745) TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly fails intermittently
Prabhu Joseph created YARN-9745: --- Summary: TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly fails intermittently Key: YARN-9745 URL: https://issues.apache.org/jira/browse/YARN-9745 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler, test Affects Versions: 3.3.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly fails intermittently {code} [ERROR] testIncreaseQueueMaxRunningAppsOnTheFly(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler) Time elapsed: 0.003 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 5000 milliseconds at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.<init>(FileOutputStream.java:213) at java.io.FileOutputStream.<init>(FileOutputStream.java:101) at java.io.FileWriter.<init>(FileWriter.java:63) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testIncreaseQueueSettingOnTheFlyInternal(TestFairScheduler.java:2394) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly(TestFairScheduler.java:2357) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler
[ https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906330#comment-16906330 ] Hadoop QA commented on YARN-9133: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 49s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 17s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} branch-3.2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 1 unchanged - 3 fixed = 1 total (was 4) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 33s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 42s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:63396beab41 | | JIRA Issue | YARN-9133 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977495/YARN-9133.branch-3.2.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 25412e954cd4 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.2 / c5f433b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24553/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/24553/artifact/out/patch-asflicense-problems.txt | | Max. process+thread count | 306 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906360#comment-16906360 ] Eric Yang commented on YARN-9561: - [~ebadger] This patch no longer applies to trunk. {code} [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c: In function ‘run_oci_container’: [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c:849:8: error: ‘ERROR_OCI_RUN_FAILED’ undeclared (first use in this function) [WARNING]rc = ERROR_OCI_RUN_FAILED; [WARNING] ^ [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c:849:8: note: each undeclared identifier is reported only once for each function it appears in [WARNING] make[2]: *** [CMakeFiles/container.dir/main/native/container-executor/impl/oci/oci.c.o] Error 1 [WARNING] make[2]: *** Waiting for unfinished jobs [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c: In function ‘reap_oci_layer_mounts_with_ctx’: [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:566:12: error: ‘ERROR_OCI_REAP_LAYER_MOUNTS_FAILED’ undeclared (first use in this function) [WARNING]int rc = ERROR_OCI_REAP_LAYER_MOUNTS_FAILED; [WARNING] ^ [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:566:12: note: each undeclared identifier is reported only once for each function it appears in [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c: In function ‘reap_oci_layer_mounts’: [WARNING] /home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:613:12: error: ‘ERROR_OCI_REAP_LAYER_MOUNTS_FAILED’ undeclared (first use in this function) [WARNING]int rc = ERROR_OCI_REAP_LAYER_MOUNTS_FAILED; [WARNING] ^ [WARNING] make[2]: *** [CMakeFiles/container.dir/main/native/container-executor/impl/oci/oci_reap.c.o] Error 1 [WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2 [WARNING] make: *** [all] Error 2 {code} Could you check? Thanks > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906403#comment-16906403 ] Hudson commented on YARN-9442: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17103 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17103/]) YARN-9442. container working directory has group read permissions. (ebadger: rev 2ac029b949f041da2ee04da441c5f9f85e1f2c64) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Attachments: YARN-9442.001.patch, YARN-9442.002.patch, > YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906422#comment-16906422 ] Hadoop QA commented on YARN-9217: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 16 unchanged - 2 fixed = 16 total (was 18) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 57s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9217 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977502/YARN-9217.009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cea1da4f9d0f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 274966e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24555/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24555/testReport/ | | Max. process+thread count | 448 (vs. ulimit of 1) | | modul
[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9561: -- Attachment: YARN-9561.003.patch > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch, > YARN-9561.003.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906441#comment-16906441 ] Eric Badger commented on YARN-9561: --- Yea another error code was added, which messed up the ERROR_OCI_RUN_FAILED error code from the previous patch. Updated the patch to fix that as well as some other stuff. > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9561.001.patch, YARN-9561.002.patch, > YARN-9561.003.patch > > > This JIRA will be used to add the C changes to the container-executor native > binary that are necessary for the new RuncContainerRuntime. There should be > no changes to existing code paths. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906452#comment-16906452 ] Hadoop QA commented on YARN-9562: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 10 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 283 new + 689 unchanged - 1 fixed = 972 total (was 690) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 16 new + 0 unchanged - 0 fixed = 16 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 57s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 5s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}100m 13s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | | Nullcheck of NodeManager.context at line 535 of value previously dereferenced in org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop() At NodeManager.java:535 of value previously dereferenced in org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop() At NodeManager.java:[line 532] | | | Unused field:NodeManager.java | | | Dead store to refreshHdfsCacheThread in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart() At ImageTa
[jira] [Updated] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9442: -- Fix Version/s: 3.2.2 3.1.3 2.9.3 3.3.0 3.0.4 2.10.0 +1 lgtm. I have committed this to trunk, branch-3.2, branch-3.1, branch-3.0, branch-2, and branch-2.9. [~Jim_Brennan], could you upload a branch-2.8 patch? There were some small naming conflicts that I cleaned up in branch-3.2 and branch-2, but this one is a little bit more. > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2 > > Attachments: YARN-9442.001.patch, YARN-9442.002.patch, > YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906463#comment-16906463 ] Jim Brennan commented on YARN-9442: --- Thanks [~ebadger]! I will put up a patch for 2.8. > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2 > > Attachments: YARN-9442.001.patch, YARN-9442.002.patch, > YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-9442: -- Attachment: YARN-9442-branch-2.8.001.patch > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2 > > Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, > YARN-9442.002.patch, YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906589#comment-16906589 ] Hadoop QA commented on YARN-9442: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.8 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 19s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} branch-2.8 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 41s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s{color} | {color:red} The patch generated 1 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 31m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:b93746a | | JIRA Issue | YARN-9442 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977536/YARN-9442-branch-2.8.001.patch | | Optional Tests | dupname asflicense compile cc mvnsite javac unit | | uname | Linux 2eec3dbb2182 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2.8 / 829afac | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | Multi-JDK versions | /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 /usr/lib/jvm/java-8-openjdk-amd64:1.8.0_222 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24557/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/24557/artifact/out/patch-asflicense-problems.txt | | Max. process+thread count | 174 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24557/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >
[jira] [Updated] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9442: -- Fix Version/s: 2.8.6 Thanks, [~Jim_Brennan]! I committed the 2.8 patch to branch-2.8 > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.3, 3.2.2 > > Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, > YARN-9442.002.patch, YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9442) container working directory has group read permissions
[ https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906625#comment-16906625 ] Jim Brennan commented on YARN-9442: --- Thanks [~ebadger]! > container working directory has group read permissions > -- > > Key: YARN-9442 > URL: https://issues.apache.org/jira/browse/YARN-9442 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.2.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.3, 3.2.2 > > Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, > YARN-9442.002.patch, YARN-9442.003.patch > > > Container working directories are currently created with permissions 0750, > owned by the user and with the group set to the node manager group. > Is there any reason why these directories need group read permissions? > I have been testing with group read permissions removed and so far I haven't > encountered any problems. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906652#comment-16906652 ] Hadoop QA commented on YARN-9561: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 60m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}135m 13s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}255m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.web.TestWebHDFS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9561 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977521/YARN-9561.003.patch | | Optional Tests | dupname asflicense compile cc mvnsite javac unit | | uname | Linux 55c31dcaf6ed 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2ac029b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24556/artifact/out/patch-unit-root.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24556/testReport/ | | Max. process+thread count | 5092 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager . U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24556/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Add C changes for the new RuncContainerRuntime > -- > > Key: YARN-9561 > URL: https://issues.apache.org/jira/browse/YARN-9561 > Project: Hadoop YARN > Issue Type: Sub-ta
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906663#comment-16906663 ] Eric Yang commented on YARN-9562: - [~ebadger] 1. Node manager crashes if the defined images-tag-to-hash-files does not exist. It would be nice, if this is a warning instead. {code} java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069) 2019-08-13 21:07:03,002 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069) 2019-08-13 21:07:03,003 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system... 2019-08-13 21:07:03,003 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped. 2019-08-13 21:07:03,004 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete. 
2019-08-13 21:07:03,005 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069) 2019-08-13 21:07:03,016 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: {code} 2. Running mapreduce job using runc container, the patch still reference to incorrect path: {code} java.io.IOException: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File does not exist: hdfs://eyang-1.openstacklocal:9000/user/yarn/null/config/9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1 at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.HdfsManifestToResourcesPlugin.getResource(HdfsManifestToResourcesPlugin.java:180) at
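Regarding point 1 above, a minimal sketch of the suggested change (this is not the actual patch; the loadImageToHashFiles helper and the fallback behavior are assumptions) would log a warning from ImageTagToManifestPlugin.serviceStart() instead of throwing:
{code:java}
import org.apache.hadoop.service.AbstractService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hedged sketch only, not the real plugin: degrade a missing
// image-tag-to-hash file from a fatal RuntimeException to a warning so
// NodeManager startup is not aborted.
class ImageTagToManifestPluginSketch extends AbstractService {
  private static final Logger LOG =
      LoggerFactory.getLogger(ImageTagToManifestPluginSketch.class);

  ImageTagToManifestPluginSketch() {
    super("ImageTagToManifestPluginSketch");
  }

  // Assumed helper: returns false when neither a local nor an HDFS
  // image-tag-to-hash file could be read.
  private boolean loadImageToHashFiles() {
    return false;
  }

  @Override
  protected void serviceStart() throws Exception {
    if (!loadImageToHashFiles()) {
      // Previously this condition threw:
      //   new RuntimeException("Couldn't load any image-tag-to-hash-files")
      LOG.warn("Couldn't load any image-tag-to-hash-files; runC container "
          + "launches will fail until a valid file is available.");
    }
    super.serviceStart();
  }
}
{code}
With this kind of fallback the NodeManager would come up and only runC container launches would fail until a valid mapping file is provided, which matches the behavior the comment asks for.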
[jira] [Assigned] (YARN-9106) Add option to graceful decommission to not wait for applications
[ https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned YARN-9106: - Assignee: Mikayla Konst > Add option to graceful decommission to not wait for applications > > > Key: YARN-9106 > URL: https://issues.apache.org/jira/browse/YARN-9106 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Mikayla Konst >Assignee: Mikayla Konst >Priority: Major > Attachments: YARN-9106.patch > > > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications. > If true (the default), the resource manager waits for all containers, as well > as all applications associated with those containers, to finish before > gracefully decommissioning a node. > If false, the resource manager only waits for containers, but not > applications, to finish. For map-only jobs or other jobs in which mappers do > not need to serve shuffle data, this allows nodes to be decommissioned as > soon as their containers are finished as opposed to when the job is done. > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters. > If false, during graceful decommission, when the resource manager waits for > all containers on a node to finish, it will not wait for app master > containers to finish. Defaults to true. This property should only be set to > false if app master failure is recoverable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9101) Recovery Container exitCode Not Right
[ https://issues.apache.org/jira/browse/YARN-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned YARN-9101: - Assignee: SuperbDong > Recovery Container exitCode Not Right > - > > Key: YARN-9101 > URL: https://issues.apache.org/jira/browse/YARN-9101 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: SuperbDong >Assignee: SuperbDong >Priority: Major > Labels: pull-request-available > Attachments: YARN-9101.1.patch, YARN-9101.patch > > > The exitCode is correct when a container launches normally, but it is not correct when > the container is recovered. > The out-of-memory exitCode is -104, but that exitCode is lost when the > container is recovered after a NodeManager restart. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9610) HeartbeatCallBack in FederationInterceptor clear AMRMToken in response from UAM should before add to asyncResponseSink
[ https://issues.apache.org/jira/browse/YARN-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned YARN-9610: - Assignee: Morty Zhong > HeartbeatCallBack in FederationInterceptor clear AMRMToken in response from > UAM should before add to asyncResponseSink > > > Key: YARN-9610 > URL: https://issues.apache.org/jira/browse/YARN-9610 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy, federation >Affects Versions: 3.2.0 >Reporter: Morty Zhong >Assignee: Morty Zhong >Priority: Major > Attachments: YARN-9610.patch.1, YARN-9610.patch.2 > > > In federation, `allocate` is async. The response from the RM is cached in > `asyncResponseSink`. > The final allocate response is merged from all RMs' allocate responses. The merge > will throw an exception when the AMRMToken from a UAM response is not null. > But setting the AMRMToken from the UAM response to null is not done inside the lock, so > there is a chance that the merge sees a UAM response whose AMRMToken is not > null. > So we should clear the token before adding the response to asyncResponseSink. > > > {code:java} > synchronized (asyncResponseSink) { > List<AllocateResponse> responses = null; > if (asyncResponseSink.containsKey(subClusterId)) { > responses = asyncResponseSink.get(subClusterId); > } else { > responses = new ArrayList<>(); > asyncResponseSink.put(subClusterId, responses); > } > responses.add(response); > // Notify main thread about the response arrival > asyncResponseSink.notifyAll(); > } > ... > if (this.isUAM && response.getAMRMToken() != null) { > Token newToken = ConverterUtils > .convertFromYarn(response.getAMRMToken(), (Text) null); > // Do not further propagate the new amrmToken for UAM > response.setAMRMToken(null); > ...{code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
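Based on the snippet in the description, a minimal sketch of the proposed ordering (types simplified: the real FederationInterceptor keys the sink by SubClusterId, and the surrounding merge logic is omitted) clears the UAM AMRMToken before the response is published, so the merging thread can never observe a non-null token:
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;

// Hedged sketch of the proposed fix: strip the UAM AMRMToken *before* the
// response becomes visible through asyncResponseSink, not after.
class UamResponseSinkSketch {
  private final Map<String, List<AllocateResponse>> asyncResponseSink =
      new HashMap<>();

  void onHeartbeatResponse(String subClusterId, AllocateResponse response,
      boolean isUAM) {
    if (isUAM && response.getAMRMToken() != null) {
      // Do not further propagate the new amrmToken for UAM; doing this before
      // publication removes the race described above.
      response.setAMRMToken(null);
    }
    synchronized (asyncResponseSink) {
      asyncResponseSink
          .computeIfAbsent(subClusterId, k -> new ArrayList<>())
          .add(response);
      // Notify the merging thread about the response arrival.
      asyncResponseSink.notifyAll();
    }
  }
}
{code}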
[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications
[ https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906755#comment-16906755 ] Hadoop QA commented on YARN-9106: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-9106 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-9106 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951275/YARN-9106.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24558/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Add option to graceful decommission to not wait for applications > > > Key: YARN-9106 > URL: https://issues.apache.org/jira/browse/YARN-9106 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Mikayla Konst >Assignee: Mikayla Konst >Priority: Major > Attachments: YARN-9106.patch > > > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications. > If true (the default), the resource manager waits for all containers, as well > as all applications associated with those containers, to finish before > gracefully decommissioning a node. > If false, the resource manager only waits for containers, but not > applications, to finish. For map-only jobs or other jobs in which mappers do > not need to serve shuffle data, this allows nodes to be decommissioned as > soon as their containers are finished as opposed to when the job is done. > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters. > If false, during graceful decommission, when the resource manager waits for > all containers on a node to finish, it will not wait for app master > containers to finish. Defaults to true. This property should only be set to > false if app master failure is recoverable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9101) Recovery Container exitCode Not Right
[ https://issues.apache.org/jira/browse/YARN-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906804#comment-16906804 ] Hadoop QA commented on YARN-9101: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 59s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 10 new + 6 unchanged - 0 fixed = 16 total (was 6) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 49s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}100m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9101 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951928/YARN-9101.1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d2d2fcf8aa97 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e6d240d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24559/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24559/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/24559/artifact/out/patch-asflicense-problems.txt | | Max. process+thread count | 401 (vs. ulimit of 1) | | mod
[jira] [Updated] (YARN-9106) Add option to graceful decommission to not wait for applications
[ https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-9106: --- Issue Type: Sub-task (was: Improvement) Parent: YARN-914 > Add option to graceful decommission to not wait for applications > > > Key: YARN-9106 > URL: https://issues.apache.org/jira/browse/YARN-9106 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Mikayla Konst >Assignee: Mikayla Konst >Priority: Major > Attachments: YARN-9106.patch > > > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications. > If true (the default), the resource manager waits for all containers, as well > as all applications associated with those containers, to finish before > gracefully decommissioning a node. > If false, the resource manager only waits for containers, but not > applications, to finish. For map-only jobs or other jobs in which mappers do > not need to serve shuffle data, this allows nodes to be decommissioned as > soon as their containers are finished as opposed to when the job is done. > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters. > If false, during graceful decommission, when the resource manager waits for > all containers on a node to finish, it will not wait for app master > containers to finish. Defaults to true. This property should only be set to > false if app master failure is recoverable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9746) Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew
Junfan Zhang created YARN-9746: -- Summary: Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew Key: YARN-9746 URL: https://issues.apache.org/jira/browse/YARN-9746 Project: Hadoop YARN Issue Type: Improvement Reporter: Junfan Zhang This issue links to YARN-5910. For multi-cluster delegation token renewal, the YARN-5910 patch works in most scenarios. But when integrating with Oozie, we encountered some problems. In Oozie there are multiple delegation tokens, including an HDFS_DELEGATION_TOKEN (another cluster's HA token) and an MR_DELEGATION_TOKEN (the Oozie MR launcher token). To support renewing the other cluster's token, the YARN-5910 patch was applied and the related config was set. The config is as follows:
{code:xml}
<property>
  <name>mapreduce.job.send-token-conf</name>
  <value>dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoop-clusterB-ns01</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1</name>
  <value>namenode01-clusterB.qiyi.hadoop:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2</name>
  <value>namenode02-clusterB.qiyi.hadoop:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hadoop-clusterB-ns01</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
{code}
However, the MR_DELEGATION_TOKEN couldn't be renewed because some configuration was missing. Although we can set the required configurations through the app, this is not a good idea. So I think the RM should only rewrite the jobConf passed by the app to solve the above situation. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
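For context, the YARN-5910 mechanism is driven by the mapreduce.job.send-token-conf property on the submitted job configuration, which lists the configuration keys whose values are shipped to the RM so it can renew tokens against another cluster. A minimal client-side sketch of setting it, reusing the allowlist quoted above (whether Oozie sets this itself or relies on cluster defaults is not specified here), might look like:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class SendTokenConfExample {
  public static void main(String[] args) {
    // Hedged sketch: set the allowlist of config keys the RM may read from the
    // job configuration when renewing tokens for another cluster. The value is
    // the same regex list quoted in the description above.
    Configuration jobConf = new Configuration();
    jobConf.set("mapreduce.job.send-token-conf",
        "dfs.namenode.kerberos.principal|dfs.nameservices"
            + "|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$"
            + "|^dfs.client.failover.proxy.provider.*$");
    System.out.println(jobConf.get("mapreduce.job.send-token-conf"));
  }
}
{code}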
[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications
[ https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906856#comment-16906856 ] Sunil Govindan commented on YARN-9106: -- wait-for-applications and wait-for-app-masters. Expecting the behaviour below: 1. wait-for-applications: by default, the suggestion is to set this to TRUE. This means that even when the containers are done, the node still cannot be decommissioned, as some apps may still be running. This is true in the case of MR; how about other apps, such as services, Tez, or Spark? I think we need to consider why we need to hold a node for a longer time based on the type of containers/apps each node has run. 2. wait-for-app-masters: this config will be helpful in order to force-kill AM containers and decommission a node faster. Thinking out loud, this is an aggressive config, however the default is turned off. Hence I think it's fine to have this. > Add option to graceful decommission to not wait for applications > > > Key: YARN-9106 > URL: https://issues.apache.org/jira/browse/YARN-9106 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Mikayla Konst >Assignee: Mikayla Konst >Priority: Major > Attachments: YARN-9106.patch > > > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications. > If true (the default), the resource manager waits for all containers, as well > as all applications associated with those containers, to finish before > gracefully decommissioning a node. > If false, the resource manager only waits for containers, but not > applications, to finish. For map-only jobs or other jobs in which mappers do > not need to serve shuffle data, this allows nodes to be decommissioned as > soon as their containers are finished as opposed to when the job is done. > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters. > If false, during graceful decommission, when the resource manager waits for > all containers on a node to finish, it will not wait for app master > containers to finish. Defaults to true. This property should only be set to > false if app master failure is recoverable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
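For reference, a small sketch of how the two proposed properties from the description could be set programmatically (the property names come from the description; whether the final patch reads them through YarnConfiguration exactly like this is an assumption):
{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class DecommissionWatcherConfigSketch {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // Only wait for containers, not whole applications, before gracefully
    // decommissioning a node (useful for map-only jobs per the description).
    conf.setBoolean(
        "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications",
        false);
    // Keep waiting for AM containers; the description suggests setting this to
    // false only when AM failure is recoverable.
    conf.setBoolean(
        "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters",
        true);
    System.out.println("wait-for-applications = " + conf.getBoolean(
        "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications",
        true));
  }
}
{code}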
[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications
[ https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906857#comment-16906857 ] Sunil Govindan commented on YARN-9106: -- cc [~leftnoteasy] [~tangzhankun] > Add option to graceful decommission to not wait for applications > > > Key: YARN-9106 > URL: https://issues.apache.org/jira/browse/YARN-9106 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Mikayla Konst >Assignee: Mikayla Konst >Priority: Major > Attachments: YARN-9106.patch > > > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications. > If true (the default), the resource manager waits for all containers, as well > as all applications associated with those containers, to finish before > gracefully decommissioning a node. > If false, the resource manager only waits for containers, but not > applications, to finish. For map-only jobs or other jobs in which mappers do > not need to serve shuffle data, this allows nodes to be decommissioned as > soon as their containers are finished as opposed to when the job is done. > Add property > yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters. > If false, during graceful decommission, when the resource manager waits for > all containers on a node to finish, it will not wait for app master > containers to finish. Defaults to true. This property should only be set to > false if app master failure is recoverable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6003) yarn-ui build failure caused by debug 2.4.0
[ https://issues.apache.org/jira/browse/YARN-6003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved YARN-6003.
-
Resolution: Not A Problem
Now this issue is not a problem. Closing.
> yarn-ui build failure caused by debug 2.4.0
> ---
>
> Key: YARN-6003
> URL: https://issues.apache.org/jira/browse/YARN-6003
> Project: Hadoop YARN
> Issue Type: Bug
> Components: build, yarn-ui-v2
>Reporter: Akira Ajisaka
>Priority: Minor
>
> The recent build failure:
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/255/artifact/out/patch-compile-root.txt
> {noformat}
> /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/debug/debug.js:126
> debug.color = selectColor(namespae);
> ^
> ReferenceError: namespae is not defined
> at createDebug
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/debug/debug.js:126:29)
> at Object.<anonymous>
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/ember-cli/lib/models/project.js:16:43)
> at Module._compile (module.js:456:26)
> at Object.Module._extensions..js (module.js:474:10)
> at Module.load (module.js:356:32)
> at Function.Module._load (module.js:312:12)
> at Module.require (module.js:364:17)
> at require (module.js:380:17)
> at Object.<anonymous>
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/ember-cli/lib/cli/index.js:4:21)
> at Module._compile (module.js:456:26)
> {noformat}
> debug@2.4.0 is broken. https://github.com/visionmedia/debug/issues/347
> Maybe we need to set the version to 2.4.1 explicitly.
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9746) Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew
[ https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-9746: --- Attachment: YARN-9746-01.path > Rm should only rewrite the jobConf passed by app when supporting > multi-cluster token renew > -- > > Key: YARN-9746 > URL: https://issues.apache.org/jira/browse/YARN-9746 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junfan Zhang >Priority: Major > Attachments: YARN-9746-01.path > > > This issue links to YARN-5910. > When to support multi-cluster delegation token renew, the path of YARN-5910 > works in most scenarios. > But when intergrating with Oozie, we encounter some problems. In Oozie having > multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA > token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew > another cluster's token, YARN-5910 was patched and related config was set. > The config is as follows > {code:xml} > > mapreduce.job.send-token-conf > > dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$ > > > dfs.nameservices > > hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04 > > > dfs.ha.namenodes.hadoop-clusterB-ns01 > nn1,nn2 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1 > namenode01-clusterB.qiyi.hadoop:8020 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2 > namenode02-clusterB.qiyi.hadoop:8020 > > > > dfs.client.failover.proxy.provider.hadoop-clusterB-ns01 > > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider > > {code} > However, the MR_DELEGATION_TOKEN could‘t be renewed, because of lacking some > config. Although we can set the required configurations through the app, this > is not a good idea. So i think rm should only rewrite the jobConf passed by > app to solve the above situation. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9746) Rm should only rewrite partial jobConf passed by app when supporting multi-cluster token renew
[ https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-9746: --- Summary: Rm should only rewrite partial jobConf passed by app when supporting multi-cluster token renew (was: Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew) > Rm should only rewrite partial jobConf passed by app when supporting > multi-cluster token renew > -- > > Key: YARN-9746 > URL: https://issues.apache.org/jira/browse/YARN-9746 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junfan Zhang >Priority: Major > Attachments: YARN-9746-01.path > > > This issue links to YARN-5910. > When to support multi-cluster delegation token renew, the path of YARN-5910 > works in most scenarios. > But when intergrating with Oozie, we encounter some problems. In Oozie having > multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA > token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew > another cluster's token, YARN-5910 was patched and related config was set. > The config is as follows > {code:xml} > > mapreduce.job.send-token-conf > > dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$ > > > dfs.nameservices > > hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04 > > > dfs.ha.namenodes.hadoop-clusterB-ns01 > nn1,nn2 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1 > namenode01-clusterB.qiyi.hadoop:8020 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2 > namenode02-clusterB.qiyi.hadoop:8020 > > > > dfs.client.failover.proxy.provider.hadoop-clusterB-ns01 > > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider > > {code} > However, the MR_DELEGATION_TOKEN could‘t be renewed, because of lacking some > config. Although we can set the required configurations through the app, this > is not a good idea. So i think rm should only rewrite the jobConf passed by > app to solve the above situation. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates
[ https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906884#comment-16906884 ] Bibin A Chundatt commented on YARN-9080:
Thank you [~Prabhu Joseph] for working on this. I have a query regarding this; sorry to come in really late.
{code}
while (iter.hasNext()) {
  FileStatus stat = iter.next();
  Path clusterTimeStampPath = stat.getPath();
  if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
    MutableBoolean appLogDirPresent = new MutableBoolean(false);
{code}
The {{fs.getFileStatus(clusterTimeStampPath)}} call inside *isValidClusterTimeStampDir* makes an additional Namenode RPC call. Can we pass the FileStatus instead of the Path to {{isValidClusterTimeStampDir(clusterTimeStampPath)}} to reduce the Namenode RPC calls? Thoughts?
> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
> Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch,
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch,
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch
>
>
> Have observed older bucket directories cluster_timestamp, bucket1 and bucket2
> as part of ATS done accumulates. The cleanLogs part of EntityLogCleaner
> removes only the app directories and not the bucket directories.
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
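A minimal sketch of the suggestion above (method shape and the numeric-name check are assumptions, not the committed patch): accept the FileStatus already returned by the listing iterator so no extra getFileStatus() round trip to the NameNode is needed.

{code:java}
import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.fs.FileStatus;

final class CleanLogsHelper {
  /**
   * Validate the cluster-timestamp directory from the FileStatus that the
   * directory listing already produced, instead of calling
   * fs.getFileStatus(path) again inside the helper.
   */
  static boolean isValidClusterTimeStampDir(FileStatus stat) {
    return stat.isDirectory()
        && StringUtils.isNumeric(stat.getPath().getName());
  }
}
{code}

The caller would then invoke {{isValidClusterTimeStampDir(stat)}} directly on the FileStatus from {{iter.next()}}, which removes one RPC per cluster-timestamp directory scanned.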
[jira] [Updated] (YARN-9080) Bucket Directories as part of ATS done accumulates
[ https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9080: Attachment: YARN-9080.addendum-001.patch > Bucket Directories as part of ATS done accumulates > -- > > Key: YARN-9080 > URL: https://issues.apache.org/jira/browse/YARN-9080 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, > 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, > YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, > YARN-9080.addendum-001.patch > > > Have observed older bucket directories cluster_timestamp, bucket1 and bucket2 > as part of ATS done accumulates. The cleanLogs part of EntityLogCleaner > removes only the app directories and not the bucket directories. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates
[ https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906907#comment-16906907 ] Prabhu Joseph commented on YARN-9080:
-
Thanks [~bibinchundatt], I have adapted the changes in the addendum patch [^YARN-9080.addendum-001.patch]. Can you review the changes when you get time? Thanks.
> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
> Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch,
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch,
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch,
> YARN-9080.addendum-001.patch
>
>
> Have observed older bucket directories cluster_timestamp, bucket1 and bucket2
> as part of ATS done accumulates. The cleanLogs part of EntityLogCleaner
> removes only the app directories and not the bucket directories.
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates
[ https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906913#comment-16906913 ] Bibin A Chundatt commented on YARN-9080:
[~Prabhu Joseph] Thank you for updating the patch. Could you handle this in a new JIRA?
> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
> Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch,
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch,
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch,
> YARN-9080.addendum-001.patch
>
>
> Have observed older bucket directories cluster_timestamp, bucket1 and bucket2
> as part of ATS done accumulates. The cleanLogs part of EntityLogCleaner
> removes only the app directories and not the bucket directories.
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
Prabhu Joseph created YARN-9747:
---
Summary: Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
Key: YARN-9747
URL: https://issues.apache.org/jira/browse/YARN-9747
Project: Hadoop YARN
Issue Type: Bug
Components: timelineserver
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph

EntityGroupFSTimelineStore#cleanLogs makes an additional Namenode RPC call.
{code}
cleanLogs:

while (iter.hasNext()) {
  FileStatus stat = iter.next();
  Path clusterTimeStampPath = stat.getPath();
  if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
    MutableBoolean appLogDirPresent = new MutableBoolean(false);
{code}
The {{fs.getFileStatus(clusterTimeStampPath)}} call inside {{isValidClusterTimeStampDir}} makes an additional Namenode RPC call.

cc [~bibinchundatt]
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
[ https://issues.apache.org/jira/browse/YARN-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9747: Attachment: YARN-9747-001.patch > Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs > --- > > Key: YARN-9747 > URL: https://issues.apache.org/jira/browse/YARN-9747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9747-001.patch > > > EntityGroupFSTimelineStore#cleanLogs creates additional Namenode RPC call. > {code} > cleanLogs: > while (iter.hasNext()) { > FileStatus stat = iter.next(); > Path clusterTimeStampPath = stat.getPath(); > if (isValidClusterTimeStampDir(clusterTimeStampPath)) { > MutableBoolean appLogDirPresent = new MutableBoolean(false); > { fs.getFileStatus(clusterTimeStampPath);}} in isValidClusterTimeStampDir* > creates additional Namenode RPC call. > {code} > cc [~bibinchundatt] -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates
[ https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906921#comment-16906921 ] Prabhu Joseph commented on YARN-9080: - [~bibinchundatt] Yes, have reported YARN-9747 and submitted patch. Thanks. > Bucket Directories as part of ATS done accumulates > -- > > Key: YARN-9080 > URL: https://issues.apache.org/jira/browse/YARN-9080 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, > 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, > YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, > YARN-9080.addendum-001.patch > > > Have observed older bucket directories cluster_timestamp, bucket1 and bucket2 > as part of ATS done accumulates. The cleanLogs part of EntityLogCleaner > removes only the app directories and not the bucket directories. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9735) Allow User Keytab to submit YARN Native Service
[ https://issues.apache.org/jira/browse/YARN-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906926#comment-16906926 ] Prabhu Joseph commented on YARN-9735:
-
Hi [~eyang], I had missed checking the reason why YARN Native Service supports only a service principal with a hostname (YARN-8571). I have seen application users start testing YARN Native Service with their user keytab and hit the above issue; they then debug and find that it requires a service keytab, which has to be created for every host on the Dev, Test, and Prod clusters and distributed. I think this affects usability for new users. I have tested with a user keytab and it also works fine. Do you think it is fine to allow a user keytab? If not, I will close this JIRA as invalid.
> Allow User Keytab to submit YARN Native Service
>
>
> Key: YARN-9735
> URL: https://issues.apache.org/jira/browse/YARN-9735
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn-native-services
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> Yarn Native Service launch fails on a secure cluster with user keytab. It
> allows only service keytab. Have seen most of the users test their jobs with
> user keytab.
> {code}
> [ambari-qa@pjosephdocker-3 ~]$ yarn app -launch sleeper-service
> /usr/hdp/3.0.1.0-187/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> 19/08/03 17:17:04 ERROR client.ApiServiceClient: Kerberos principal
> (ambari-qa-pjosephdoc...@docker.com) does not contain a hostname.
> {code}
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
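For context, here is a hedged sketch of the kind of principal check being discussed, using Hadoop's KerberosName utility; this is not the actual ApiServiceClient code, and the class and method names below are illustrative.

{code:java}
import org.apache.hadoop.security.authentication.util.KerberosName;

public final class PrincipalCheck {
  /**
   * A service principal such as user/host@REALM has a hostname component,
   * while a user principal such as ambari-qa@REALM does not. Rejecting
   * principals without a hostname is what blocks user keytabs today.
   */
  public static boolean hasHostname(String principal) {
    KerberosName name = new KerberosName(principal);
    return name.getHostName() != null && !name.getHostName().isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(hasHostname("rm/host1.example.com@EXAMPLE.COM")); // true
    System.out.println(hasHostname("ambari-qa@EXAMPLE.COM"));            // false
  }
}
{code}

Allowing user keytabs would amount to relaxing this style of check (or making it configurable), which is the usability question raised above.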
[jira] [Commented] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
[ https://issues.apache.org/jira/browse/YARN-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906931#comment-16906931 ] Bibin A Chundatt commented on YARN-9747:
Thank you [~Prabhu Joseph] for the patch. +1, LGTM. Will wait for the Jenkins results.
> Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
> ---
>
> Key: YARN-9747
> URL: https://issues.apache.org/jira/browse/YARN-9747
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9747-001.patch
>
>
> EntityGroupFSTimelineStore#cleanLogs creates additional Namenode RPC call.
> {code}
> cleanLogs:
> while (iter.hasNext()) {
> FileStatus stat = iter.next();
> Path clusterTimeStampPath = stat.getPath();
> if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
> MutableBoolean appLogDirPresent = new MutableBoolean(false);
> { fs.getFileStatus(clusterTimeStampPath);}} in isValidClusterTimeStampDir*
> creates additional Namenode RPC call.
> {code}
> cc [~bibinchundatt]
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org