[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908792#comment-16908792 ] Szilard Nemeth commented on YARN-9749: -- Thanks [~adam.antal] for this fix, committed to trunk and to branch-3.2. Thanks [~pbacsko] for the review! > TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk > > > Key: YARN-9749 > URL: https://issues.apache.org/jira/browse/YARN-9749 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Reporter: Peter Bacsko >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9749.001.patch > > > TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It > was most likely introduced by YARN-9676 (resetting HEAD to the previous > commit and then re-running the test passes). > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] > testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) > Time elapsed: 2.361 s <<< FAILURE!
> java.lang.AssertionError: The set of paths for deletion are not the same as > expected: actual size: 0 vs expected size: 1 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) > at > org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) > at > org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) > at > org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown > Source) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
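For context, the assertion that fails above compares the set of paths the log aggregator scheduled for deletion against the expected set. A minimal stdlib-only sketch of that comparison (names and paths are hypothetical, not the actual test code):

```java
import java.util.HashSet;
import java.util.Set;

public class VerifyDeletionPaths {
    // Hypothetical re-creation of the verifyFilesToDelete check: returns the
    // failure message when the two sets differ, or null when they match.
    public static String mismatchMessage(Set<String> expected, Set<String> actual) {
        if (actual.equals(expected)) {
            return null;
        }
        return "The set of paths for deletion are not the same as expected: "
            + "actual size: " + actual.size()
            + " vs expected size: " + expected.size();
    }

    public static void main(String[] args) {
        Set<String> expected = new HashSet<>();
        expected.add("/tmp/logs/application_1/node_1"); // placeholder path
        Set<String> actual = new HashSet<>();           // aggregator scheduled nothing
        System.out.println(mismatchMessage(expected, actual));
    }
}
```

Per the report, the post-cleanup path stopped scheduling the expected deletion after YARN-9676, which is how `actual` ends up empty.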
[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908791#comment-16908791 ] Bibin A Chundatt commented on YARN-5857: Thank you [~BilwaST] for updating the patch. +1 LGTM. Will wait for [~adam.antal] comments too. > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, YARN-5857-002.patch, > testFixedSizeThreadPool failure reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/ -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-5857: Attachment: YARN-5857-002.patch > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, YARN-5857-002.patch, > testFixedSizeThreadPool failure reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/ -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2599) Standby RM should also expose some jmx and metrics
[ https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-2599: - Release Note: YARN /jmx URL end points will be accessible per resource manager process. Hence there will not be any redirection to active resource manager while accessing /jmx endpoints. > Standby RM should also expose some jmx and metrics > -- > > Key: YARN-2599 > URL: https://issues.apache.org/jira/browse/YARN-2599 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1 >Reporter: Karthik Kambatla >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-2599.002.patch, YARN-2599.patch > > > YARN-1898 redirects jmx and metrics to the Active. As discussed there, we > need to separate out metrics displayed so the Standby RM can also be > monitored. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-5857: Attachment: (was: YARN-5857-002.patch) > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, testFixedSizeThreadPool failure > reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/ -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2599) Standby RM should also expose some jmx and metrics
[ https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908779#comment-16908779 ] Sunil Govindan commented on YARN-2599: -- cc [~cheersyang] New patch is added. > Standby RM should also expose some jmx and metrics > -- > > Key: YARN-2599 > URL: https://issues.apache.org/jira/browse/YARN-2599 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1 >Reporter: Karthik Kambatla >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-2599.002.patch, YARN-2599.patch > > > YARN-1898 redirects jmx and metrics to the Active. As discussed there, we > need to separate out metrics displayed so the Standby RM can also be > monitored. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2599) Standby RM should also expose some jmx and metrics
[ https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-2599: - Attachment: YARN-2599.002.patch > Standby RM should also expose some jmx and metrics > -- > > Key: YARN-2599 > URL: https://issues.apache.org/jira/browse/YARN-2599 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1 >Reporter: Karthik Kambatla >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-2599.002.patch, YARN-2599.patch > > > YARN-1898 redirects jmx and metrics to the Active. As discussed there, we > need to separate out metrics displayed so the Standby RM can also be > monitored. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908771#comment-16908771 ] Bilwa S T commented on YARN-5857: - Hi [~adam.antal], latch.countDown() should be put before tryLock, as tryLock retries for 35 seconds. > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, YARN-5857-002.patch, > testFixedSizeThreadPool failure reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/
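The ordering argued for in the comment above can be sketched with plain JDK primitives (all names are hypothetical, not the actual test code): the worker signals "started" before contending for the lock, so the test's latch.await() returns promptly instead of waiting out a long tryLock timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LatchBeforeTryLock {
    // Worker signals first, then blocks on the lock. If the order were
    // reversed, the latch would only count down after tryLock gave up.
    static boolean worker(ReentrantLock lock, CountDownLatch started)
            throws InterruptedException {
        started.countDown(); // signal before blocking
        boolean acquired = lock.tryLock(100, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }
        return acquired;
    }

    public static boolean demo() throws Exception {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // the "test" thread holds the lock while workers start
        CountDownLatch started = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                worker(lock, started);
            } catch (InterruptedException ignored) {
            }
        });
        t.start();
        // Returns quickly because countDown happens before tryLock:
        boolean sawStart = started.await(5, TimeUnit.SECONDS);
        lock.unlock();
        t.join();
        return sawStart;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker signalled start: " + demo());
    }
}
```

In the real test the tryLock timeout is reportedly 35 seconds; signalling before locking keeps the latch wait independent of that timeout.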
[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-5857: Attachment: YARN-5857-002.patch > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, YARN-5857-002.patch, > testFixedSizeThreadPool failure reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/ -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-9753) Cache Pre-Priming
[ https://issues.apache.org/jira/browse/YARN-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved YARN-9753. --- Resolution: Invalid Created issue in wrong project. > Cache Pre-Priming > - > > Key: YARN-9753 > URL: https://issues.apache.org/jira/browse/YARN-9753 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Akash R Nilugal >Priority: Major > > Currently, we have an index server which helps with distributed > caching of the datamaps in a separate spark application. > The caching of the datamaps in the index server starts once a query is > fired on the table for the first time: all the datamaps are loaded > if count(*) is fired, and only the required ones are loaded for any filter > query. > The bottleneck is that until a query is fired on the > table, no caching is done for the table's datamaps. > Consider a scenario where we load data into a table for a whole > day and then query it the next day: > all the segments start loading into the cache, so the first query > is slow. > What if we load the datamaps into the cache, i.e. pre-prime the cache, without > waiting for any query on the table? > We could load the cache after every data load, for all the segments at once, > so that the first query need not do all this work, which makes it faster. > I have attached the design document for the pre-priming of the cache in the > index server. Please have a look at it.
[jira] [Created] (YARN-9753) Cache Pre-Priming
Akash R Nilugal created YARN-9753: - Summary: Cache Pre-Priming Key: YARN-9753 URL: https://issues.apache.org/jira/browse/YARN-9753 Project: Hadoop YARN Issue Type: Bug Reporter: Akash R Nilugal Currently, we have an index server which helps with distributed caching of the datamaps in a separate spark application. The caching of the datamaps in the index server starts once a query is fired on the table for the first time: all the datamaps are loaded if count(*) is fired, and only the required ones are loaded for any filter query. The bottleneck is that until a query is fired on the table, no caching is done for the table's datamaps. Consider a scenario where we load data into a table for a whole day and then query it the next day: all the segments start loading into the cache, so the first query is slow. What if we load the datamaps into the cache, i.e. pre-prime the cache, without waiting for any query on the table? We could load the cache after every data load, for all the segments at once, so that the first query need not do all this work, which makes it faster. I have attached the design document for the pre-priming of the cache in the index server. Please have a look at it.
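The proposal above boils down to warming the cache eagerly after each data load instead of lazily on first query. A minimal stdlib-only sketch of the two paths, using an in-memory map as a stand-in for the index server's datamap cache (all names hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PrePrimingSketch {
    // Hypothetical stand-in for the index server's datamap cache.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Lazy path: the first query on a segment pays the loading cost.
    public String query(String segment) {
        return cache.computeIfAbsent(segment, PrePrimingSketch::loadDatamap);
    }

    // Pre-priming path: warm all segments right after a data load finishes,
    // so the first query finds everything already cached.
    public void prePrime(List<String> segments) {
        for (String s : segments) {
            cache.computeIfAbsent(s, PrePrimingSketch::loadDatamap);
        }
    }

    public boolean isCached(String segment) {
        return cache.containsKey(segment);
    }

    private static String loadDatamap(String segment) {
        return "datamap:" + segment; // placeholder for the expensive load
    }

    public static void main(String[] args) {
        PrePrimingSketch s = new PrePrimingSketch();
        s.prePrime(Arrays.asList("seg0", "seg1"));
        System.out.println("seg0 cached before any query: " + s.isCached("seg0"));
    }
}
```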
[jira] [Created] (YARN-9752) Add support for allocation id in SLS.
Abhishek Modi created YARN-9752: --- Summary: Add support for allocation id in SLS. Key: YARN-9752 URL: https://issues.apache.org/jira/browse/YARN-9752 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi
[jira] [Commented] (YARN-9616) Shared Cache Manager Failed To Upload Unpacked Resources
[ https://issues.apache.org/jira/browse/YARN-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908644#comment-16908644 ] wangchengwei commented on YARN-9616: [~wzzdreamer] it seems I can't upload my patch... > Shared Cache Manager Failed To Upload Unpacked Resources > > > Key: YARN-9616 > URL: https://issues.apache.org/jira/browse/YARN-9616 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.3, 2.9.2, 2.8.5 >Reporter: zhenzhao wang >Assignee: zhenzhao wang >Priority: Major > Attachments: YARN-9616.001-2.9.patch > > > Yarn will unpack archives files and some other files based on the file type > and configuration. E.g. > If I started an MR job with -archive one.zip, then the one.zip will be > unpacked while download. Let's say there're file1 && file2 inside one.zip. > Then the files kept on local disk will be like > /disk3/yarn/local/filecache/352/one.zip/file1 > and/disk3/yarn/local/filecache/352/one.zip/file2. So the shared cache > uploader couldn't upload one.zip to shared cache as it was removed during > localization. The following errors will be thrown. 
> {code:java} > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploader: > Exception while uploading the file dict.zip > java.io.FileNotFoundException: File > /disk3/yarn/local/filecache/352/one.zip/one.zip does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:631) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:857) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:621) > at > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442) > at > org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.(ChecksumFileSystem.java:146) > at > org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:347) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:926) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploader.computeChecksum(SharedCacheUploader.java:257) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploader.call(SharedCacheUploader.java:128) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploader.call(SharedCacheUploader.java:55) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
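The root cause described above is that the localized archive is replaced by its unpacked directory, so the uploader's attempt to open the original file for checksumming fails. A stdlib-only sketch of a guard the uploader could apply (method name hypothetical, not the actual YARN patch):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UploadEligibilityCheck {
    // If the localized path is a directory, the archive was unpacked during
    // localization and the original file no longer exists, so skip the
    // shared-cache upload rather than fail with FileNotFoundException.
    public static boolean canUpload(Path localized) {
        return Files.isRegularFile(localized);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("one.zip");   // unpacked archive
        Path file = Files.createTempFile("resource", ".jar"); // regular resource
        System.out.println("unpacked dir uploadable: " + canUpload(dir));
        System.out.println("regular file uploadable: " + canUpload(file));
    }
}
```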
[jira] [Created] (YARN-9751) Separate queue and app ordering policy capacity scheduler configs
Jonathan Hung created YARN-9751: --- Summary: Separate queue and app ordering policy capacity scheduler configs Key: YARN-9751 URL: https://issues.apache.org/jira/browse/YARN-9751 Project: Hadoop YARN Issue Type: Task Reporter: Jonathan Hung Right now it's not possible to specify distinct app and queue ordering policies since they share the same {{ordering-policy}} suffix. There's already a TODO in CapacitySchedulerConfiguration for this. This Jira intends to fix it. {noformat} // TODO (wangda): We need to better distinguish app ordering policy and queue // ordering policy's classname / configuration options, etc. And dedup code // if possible.{noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
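To illustrate the clash: in the current CapacityScheduler configuration the same `.ordering-policy` suffix means app ordering on a leaf queue but queue ordering on a parent queue. The first two property names below follow the existing `yarn.scheduler.capacity.<queue-path>` convention; the disambiguated names are purely hypothetical, not what this Jira will necessarily choose.

```properties
# Today: one suffix, two meanings depending on queue type
yarn.scheduler.capacity.root.ordering-policy=utilization       # queue ordering (parent queue)
yarn.scheduler.capacity.root.default.ordering-policy=fifo      # app ordering (leaf queue)

# One possible disambiguation (hypothetical property names):
yarn.scheduler.capacity.root.queue-ordering-policy=utilization
yarn.scheduler.capacity.root.default.app-ordering-policy=fifo
```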
[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908384#comment-16908384 ] Peter Bacsko commented on YARN-9749: +1 non-binding > TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk > > > Key: YARN-9749 > URL: https://issues.apache.org/jira/browse/YARN-9749 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Reporter: Peter Bacsko >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9749.001.patch > > > TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It > was most likely introduced by YARN-9676 (resetting HEAD to the previous > commit and then re-running the test passes). > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] > testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) > Time elapsed: 2.361 s <<< FAILURE! 
> java.lang.AssertionError: The set of paths for deletion are not the same as > expected: actual size: 0 vs expected size: 1 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) > at > org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) > at > org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) > at > org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown > Source) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) > ... > {noformat}
[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908372#comment-16908372 ] Hadoop QA commented on YARN-9100: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 10 unchanged - 6 fixed = 10 total (was 16) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 50s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9100 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977719/YARN-9100-009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c8c7f02d8e60 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c801f7a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24575/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24575/testReport/ | | Max. process+thread count | 465 (vs. ulim
[jira] [Commented] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400
[ https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908337#comment-16908337 ] Hadoop QA commented on YARN-9461: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 26s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 58s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}136m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9461 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12965202/YARN-9461-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 08a5de39895a 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3f4f097 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24573/testReport/ | | Max. process+thread count | 873 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24573/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > TestRMWebServicesDelegationTokenAuthenti
[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908310#comment-16908310 ] Hadoop QA commented on YARN-9749: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 70m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9749 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977717/YARN-9749.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fde964138990 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c801f7a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24574/testReport/ | | Max. process+thread count | 412 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24574/console | | Powered by | Apache Yetus 0.8.0
[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9100: --- Attachment: YARN-9100-009.patch > Add tests for GpuResourceAllocator and do minor code cleanup > > > Key: YARN-9100 > URL: https://issues.apache.org/jira/browse/YARN-9100 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9100-004.patch, YARN-9100-005.patch, > YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, > YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, > YARN-9100.003.patch > > > Add tests for GpuResourceAllocator and do minor code cleanup > - Improved log and exception messages > - Added some new debug logs > - Some methods are named like *Copy; these return copies of internal > data structures. The word "copy" is just noise in their names, so they have > been renamed. Additionally, the copied data structures were made immutable. > - The waiting loop in the method assignGpus was decoupled into a new class, > RetryCommand. > A few more words about the new class RetryCommand: > There are similar waiting loops in the code in AMRMClient, > AMRMClientAsync and even in GenericTestUtils (see the waitFor method). > RetryCommand could be a future replacement for this duplicated code, as it > solves the waiting-loop problem in a generic way. > The only downside of using RetryCommand in GpuResourceAllocator > (startGpuAssignmentLoop) is the ugly exception handling part, but that's > solely because of how Java deals with checked exceptions vs. lambdas. If there's > a cleaner way to handle the exceptions, I'm open to any suggestions. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
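The waiting-loop extraction described in the YARN-9100 description can be illustrated with a small generic retry helper. This is only a hedged sketch, not the actual RetryCommand from the patch; the class and method names (RetrySketch, retryUntil) and the parameter set are my own assumptions.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

public class RetrySketch {
    // Hypothetical helper: re-run "action" until "done" accepts its result
    // or the attempt budget is exhausted (the shape of a generic waiting loop).
    public static <T> T retryUntil(Callable<T> action, Predicate<T> done,
                                   int maxAttempts, long sleepMillis)
            throws Exception {
        T result = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            result = action.call();      // the guarded operation
            if (done.test(result)) {
                return result;           // condition met, stop waiting
            }
            Thread.sleep(sleepMillis);   // back off before retrying
        }
        return result;                   // caller handles the timeout case
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Toy usage: the "assignment" succeeds on the third attempt.
        Integer attempts = retryUntil(() -> ++calls[0], n -> n >= 3, 10, 1L);
        System.out.println("attempts=" + attempts);  // prints attempts=3
    }
}
```

A generic helper like this is what lets similar loops in AMRMClient, AMRMClientAsync, and GenericTestUtils#waitFor share one implementation; the checked-exception awkwardness the comment mentions comes from Callable#call declaring `throws Exception`, which lambdas cannot hide.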
[jira] [Commented] (YARN-9679) Regular code cleanup in TestResourcePluginManager
[ https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908261#comment-16908261 ] Szilard Nemeth commented on YARN-9679: -- Thanks [~adam.antal], resolved it! > Regular code cleanup in TestResourcePluginManager > - > > Key: YARN-9679 > URL: https://issues.apache.org/jira/browse/YARN-9679 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Labels: newbie > Fix For: 3.3.0 > > > There are several things that could be cleaned up in this class: > 1. stubResourcePluginmanager should be private. > 2. In tearDown, the result of dest.delete() should be checked. > 3. In the class CustomizedResourceHandler, several methods have unnecessary > exception declarations. > 4. The class MyMockNM should be renamed to something more meaningful. > 5. There are some dangling javadoc comments, for example: > {code:java} > /* >* Make sure ResourcePluginManager is initialized during NM start up. >*/ > {code} > 6. Some exceptions are declared on test methods but never > thrown; an example: > testLinuxContainerExecutorWithResourcePluginsEnabled > 7. Assert.assertTrue(false); expressions should be replaced with Assert.fail() > 8. A handful of usages of Mockito's spy method. This method is not preferred, > so we should think about replacing it with mocks somehow. > The rest can be figured out by whoever takes this jira :)
[jira] [Commented] (YARN-9679) Regular code cleanup in TestResourcePluginManager
[ https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908259#comment-16908259 ] Adam Antal commented on YARN-9679: -- Thank you for the commit [~snemeth]! Half of the class looks different / missing on branch-3.2. I see no gain in backporting it. The issue can be resolved.
[jira] [Updated] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Antal updated YARN-9749: - Attachment: YARN-9749.001.patch > TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk > > > Key: YARN-9749 > URL: https://issues.apache.org/jira/browse/YARN-9749 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Reporter: Peter Bacsko >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9749.001.patch > > > TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It > was most likely introduced by YARN-9676 (resetting HEAD to the previous > commit and then re-running the test passes). > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] > testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) > Time elapsed: 2.361 s <<< FAILURE! 
> java.lang.AssertionError: The set of paths for deletion are not the same as > expected: actual size: 0 vs expected size: 1 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) > at > org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) > at > org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) > at > org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown > Source) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) > ... > {noformat}
[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908227#comment-16908227 ] Adam Antal commented on YARN-9749: -- Uploaded patch v1 with the straightforward fix. I revisited the patch looking for other nasty NPEs of this kind, but the other parts are fine.
[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908222#comment-16908222 ] Adam Antal commented on YARN-9749: -- Sooo my patch actually introduced a bug in the code. When diagnosticMessage.isEmpty() is called (introduced by YARN-9676), it turns out that the message is null. Previously it had been set to "" (empty String) by default, so I didn't realise it could be an issue. The following line causes the NPE: {noformat} } catch (Exception e) { diagnosticMessage = e.getMessage(); ... } {noformat} We cannot assume that an exception always has a proper message text. I'll upload a fix soon.
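The failure mode described in the comment above can be reproduced in a few lines: Throwable.getMessage() returns null for an exception constructed without a message, so a later isEmpty() call throws a NullPointerException. This is a minimal hedged illustration — the variable name diagnosticMessage mirrors the one in the comment, and the null guard shown is one possible fix, not necessarily the exact patch.

```java
public class NullMessageDemo {
    public static void main(String[] args) {
        String diagnosticMessage = "";
        try {
            throw new RuntimeException();   // no message supplied
        } catch (Exception e) {
            // getMessage() is null here; null-check instead of assigning directly,
            // otherwise a later diagnosticMessage.isEmpty() call would NPE.
            diagnosticMessage = e.getMessage() == null ? "" : e.getMessage();
        }
        System.out.println("empty=" + diagnosticMessage.isEmpty());  // prints empty=true
    }
}
```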
[jira] [Commented] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908208#comment-16908208 ] Hudson commented on YARN-9488: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17130 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17130/]) YARN-9488. Skip YARNFeatureNotEnabledException from ClientRMService. (snemeth: rev 1845a83cec6563482523d8c34b38c4e36c0aa9df) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java > Skip YARNFeatureNotEnabledException from ClientRMService > > > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 8050, call Call#5 Retry#0 > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles > from 172.26.81.91:41198 > org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource > profile is not enabled, please enable resource profile feature before using > its functions. 
(by setting yarn.resourcemanager.resource-profiles.enabled to > true) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code}
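Skipping an expected exception type from server logs, as YARN-9488 does, generally amounts to checking the exception class before emitting a full stack trace. The sketch below is an assumption-laden stand-in: the class TerseExceptionDemo, the method logFor, and the use of IllegalStateException in place of YARNFeatureNotEnabledException are all hypothetical illustrations, not the actual YARN-9488 change.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Set;

public class TerseExceptionDemo {
    // Exception types that are expected in normal operation and should not
    // spam the log. IllegalStateException stands in for the real
    // YARNFeatureNotEnabledException here.
    private static final Set<Class<? extends Exception>> TERSE =
        Set.of(IllegalStateException.class);

    // Expected exceptions get a one-line note; everything else keeps its stack trace.
    static String logFor(Exception e) {
        if (TERSE.contains(e.getClass())) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
        StringWriter sw = new StringWriter();
        e.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(logFor(new IllegalStateException("resource profile disabled")));
        // prints: IllegalStateException: resource profile disabled
    }
}
```

The effect is the same as in the RM case: clients that probe a disabled feature no longer fill the log with multi-line traces, while genuinely unexpected failures remain fully visible.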
[jira] [Commented] (YARN-9679) Regular code cleanup in TestResourcePluginManager
[ https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908209#comment-16908209 ] Hudson commented on YARN-9679: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17130 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17130/]) YARN-9679. Regular code cleanup in TestResourcePluginManager (#1122) (954799+szilard-nemeth: rev 22c4f38c4b005a70c9b95d8aaa350763aaec5c5e) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/TestResourcePluginManager.java
[jira] [Commented] (YARN-9679) Regular code cleanup in TestResourcePluginManager
[ https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908198#comment-16908198 ] Szilard Nemeth commented on YARN-9679: -- Hi [~adam.antal]! As mentioned on GitHub, I merged your patch to trunk. Could you please provide branch-3.2 / branch-3.1 patches as well? I tried to cherry-pick the trunk patch to 3.2 but it had many conflicts. Thanks!
[jira] [Updated] (YARN-9679) Regular code cleanup in TestResourcePluginManager
[ https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9679: - Fix Version/s: 3.3.0 > Regular code cleanup in TestResourcePluginManager > - > > Key: YARN-9679 > URL: https://issues.apache.org/jira/browse/YARN-9679 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Labels: newbie > Fix For: 3.3.0 > > > There are several things that could be cleaned up in this class: > 1. stubResourcePluginmanager should be private. > 2. In tearDown, the result of dest.delete() should be checked. > 3. In class CustomizedResourceHandler, there are several methods where > exception declarations are unnecessary. > 4. Class MyMockNM should be renamed to a more meaningful name. > 5. There are some dangling javadoc comments, for example: > {code:java} > /* >* Make sure ResourcePluginManager is initialized during NM start up. >*/ > {code} > 6. There are some exceptions unnecessarily declared on test methods but they > are never thrown, for example: > testLinuxContainerExecutorWithResourcePluginsEnabled > 7. Assert.assertTrue(false); expressions should be replaced with Assert.fail(). > 8. A handful of usages of Mockito's spy method. This method is not preferred, > so we should think about replacing it with mocks somehow. > The rest can be figured out by whoever takes this jira :)
[jira] [Created] (YARN-9750) TestAppLogAggregatorImpl.verifyFilesToDelete fails
Prabhu Joseph created YARN-9750: --- Summary: TestAppLogAggregatorImpl.verifyFilesToDelete fails Key: YARN-9750 URL: https://issues.apache.org/jira/browse/YARN-9750 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, test Affects Versions: 3.3.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph *TestAppLogAggregatorImpl.verifyFilesToDelete fails* {code} [ERROR] testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) Time elapsed: 2.252 s <<< FAILURE! java.lang.AssertionError: The set of paths for deletion are not the same as expected: actual size: 0 vs expected size: 1 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) at org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) at 
org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) at org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1136724178.delete(Unknown Source) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) 
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.boo
[jira] [Assigned] (YARN-9748) Allow capacity-scheduler configuration on HDFS
[ https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph reassigned YARN-9748: --- Assignee: Prabhu Joseph > Allow capacity-scheduler configuration on HDFS > -- > > Key: YARN-9748 > URL: https://issues.apache.org/jira/browse/YARN-9748 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.2 >Reporter: zhoukang >Assignee: Prabhu Joseph >Priority: Major > > Improvement: > Support auto reload from hdfs -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9748) Allow capacity-scheduler configuration on HDFS
[ https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908188#comment-16908188 ] Prabhu Joseph commented on YARN-9748: - Thanks [~cane]. Please feel free to reassign if you want to share the same code here. > Allow capacity-scheduler configuration on HDFS > -- > > Key: YARN-9748 > URL: https://issues.apache.org/jira/browse/YARN-9748 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.2 >Reporter: zhoukang >Priority: Major > > Improvement: > Support auto reload from hdfs -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
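The "auto reload from hdfs" idea can be sketched as a modification-time check. The sketch below is entirely hypothetical (this thread contains no implementation details): it uses java.nio to stay self-contained, whereas a real YARN patch would read capacity-scheduler.xml from an HDFS path through Hadoop's FileSystem API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical sketch: reload a scheduler configuration file only when its
// modification time advances. Class and method names are illustrative only.
public class ConfigReloader {
    private FileTime lastSeen = FileTime.fromMillis(0);

    /** Returns true if the file changed since the last check and was "reloaded". */
    public boolean maybeReload(Path confFile) throws Exception {
        FileTime mtime = Files.getLastModifiedTime(confFile);
        if (mtime.compareTo(lastSeen) > 0) {
            lastSeen = mtime;
            // parse and apply the new capacity-scheduler settings here
            return true;
        }
        return false;
    }
}
```

In practice such a check would run on a timer thread inside the ResourceManager, and the interesting design questions (atomic config swaps, validation before applying) are exactly what the jira would need to settle.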
[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908186#comment-16908186 ] Adam Antal commented on YARN-9749: -- Thanks for catching this [~pbacsko]! Will look into this shortly. > TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk > > > Key: YARN-9749 > URL: https://issues.apache.org/jira/browse/YARN-9749 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Reporter: Peter Bacsko >Assignee: Adam Antal >Priority: Major > > TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It > was most likely introduced by YARN-9676 (resetting HEAD to the previous > commit and then re-running the test passes). > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] > testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) > Time elapsed: 2.361 s <<< FAILURE! 
> java.lang.AssertionError: The set of paths for deletion are not the same as > expected: actual size: 0 vs expected size: 1 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) > at > org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) > at > org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) > at > org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown > Source) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908185#comment-16908185 ] Prabhu Joseph commented on YARN-9488: - Thanks [~snemeth]. > Skip YARNFeatureNotEnabledException from ClientRMService > > > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 8050, call Call#5 Retry#0 > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles > from 172.26.81.91:41198 > org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource > profile is not enabled, please enable resource profile feature before using > its functions. 
(by setting yarn.resourcemanager.resource-profiles.enabled to > true) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
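Hadoop's RPC layer supports marking expected exception types as "terse" (org.apache.hadoop.ipc.Server#addTerseExceptions), which logs only the exception message instead of a full stack trace; whether this particular patch takes that route is not shown in the thread. The general idea can be sketched in self-contained Java, with a locally defined stand-in for YARN's exception class:

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Set;

// Illustrative sketch (not YARN's actual API): render "expected" exceptions as a
// single line and everything else with a full stack trace.
public class TerseLogging {
    // Local stand-in so the sketch compiles on its own; the real class lives in
    // org.apache.hadoop.yarn.exceptions.
    public static class YARNFeatureNotEnabledException extends Exception {
        public YARNFeatureNotEnabledException(String msg) { super(msg); }
    }

    static final Set<String> TERSE = Set.of("YARNFeatureNotEnabledException");

    public static String format(Throwable t) {
        String name = t.getClass().getSimpleName();
        if (TERSE.contains(name)) {
            return name + ": " + t.getMessage();   // one line, no stack trace
        }
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));    // full trace for unexpected errors
        return sw.toString();
    }
}
```

Applied to the log above, each Distributed Shell submission would then produce a single INFO/DEBUG line rather than a 13-frame stack trace.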
[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails
[ https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908183#comment-16908183 ] Prabhu Joseph commented on YARN-9743: - [~adam.antal] It looks like JDK 11 removed the JAXB reference implementation classes, including "com.sun.xml.internal.bind.v2.ContextFactory", from the JDK. I think we need to add the jaxb-impl library to the classpath to fix this. {code:java} https://issues.openbravo.com/view.php?id=38900 http://openjdk.java.net/jeps/320 {code} > [JDK11] TestTimelineWebServices.testContextFactory fails > > > Key: YARN-9743 > URL: https://issues.apache.org/jira/browse/YARN-9743 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.2.0 >Reporter: Adam Antal >Priority: Major > > Tested on OpenJDK 11.0.2 on a Mac. > Stack trace: > {noformat} > [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: > 36.016 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices > [ERROR] > testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices) > Time elapsed: 1.031 s <<< ERROR! 
> java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory > at > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > at > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > at java.base/java.lang.Class.forName0(Native Method) > at java.base/java.lang.Class.forName(Class.java:315) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112) > at > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
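The failure mechanism above can be reproduced independently of the test: JEP 320 removed the JDK-bundled JAXB reference implementation in Java 11, so a reflective lookup of com.sun.xml.internal.bind.v2.ContextFactory throws ClassNotFoundException unless a standalone jaxb-impl jar is on the classpath. A minimal, self-contained probe (class name is ours, not YARN's):

```java
public class JaxbProbe {
    /** Returns true if the named class is loadable from the current classpath. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On JDK 8 the JDK-internal JAXB factory is present; on JDK 11+ this
        // typically prints false unless a jaxb-impl jar was added (JEP 320).
        System.out.println(isPresent("com.sun.xml.internal.bind.v2.ContextFactory"));
    }
}
```

This matches the stack trace: YARN's ContextFactory.newContext does a Class.forName on exactly that internal class name.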
[jira] [Commented] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908181#comment-16908181 ] Szilard Nemeth commented on YARN-9488: -- Thanks [~Prabhu Joseph] for this patch! Committed it to trunk, branch-3.2 and branch-3.1. > Skip YARNFeatureNotEnabledException from ClientRMService > > > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 8050, call Call#5 Retry#0 > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles > from 172.26.81.91:41198 > org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource > profile is not enabled, please enable resource profile feature before using > its functions. 
(by setting yarn.resourcemanager.resource-profiles.enabled to > true) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908167#comment-16908167 ] Szilard Nemeth commented on YARN-9488: -- Hi [~Prabhu Joseph]! +1, committing shortly > Skip YARNFeatureNotEnabledException from ClientRMService > > > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 8050, call Call#5 Retry#0 > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles > from 172.26.81.91:41198 > org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource > profile is not enabled, please enable resource profile feature before using > its functions. 
(by setting yarn.resourcemanager.resource-profiles.enabled to > true) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8148) Update decimal values for queue capacities shown on queue status cli
[ https://issues.apache.org/jira/browse/YARN-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908163#comment-16908163 ] Szilard Nemeth commented on YARN-8148: -- Hi [~sunilg], [~wangda]! I wanted to commit this, but then a question came to mind: should we worry about backward compatibility? The output of the Admin CLI will change; can we still commit this to trunk / 3.2 and 3.1? Thanks! > Update decimal values for queue capacities shown on queue status cli > > > Key: YARN-8148 > URL: https://issues.apache.org/jira/browse/YARN-8148 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-8148-002.patch, YARN-8148.1.patch > > > Capacities are shown with two decimal values in RM UI as part of YARN-6182. > The queue status CLI is still showing one decimal value. > {code} > [root@bigdata3 yarn]# yarn queue -status default > Queue Information : > Queue Name : default > State : RUNNING > Capacity : 69.9% > Current Capacity : .0% > Maximum Capacity : 70.0% > Default Node Label expression : > Accessible Node Labels : * > Preemption : enabled > {code}
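The change under discussion is purely a formatting one: the RM UI prints capacities with two decimal places while the CLI still prints one. A minimal sketch of the difference (format strings are our assumption; the patch's actual code is not shown in this thread). Locale.ROOT is used so the decimal separator is stable:

```java
import java.util.Locale;

public class CapacityFormat {
    // Current CLI style: one decimal place.
    public static String onePlace(float capacity) {
        return String.format(Locale.ROOT, "%.1f%%", capacity);
    }

    // RM UI style since YARN-6182: two decimal places.
    public static String twoPlaces(float capacity) {
        return String.format(Locale.ROOT, "%.2f%%", capacity);
    }

    public static void main(String[] args) {
        float capacity = 69.9f;
        System.out.println(onePlace(capacity));
        System.out.println(twoPlaces(capacity));
    }
}
```

This also frames the backward-compatibility question above: any script parsing "Capacity : 69.9%" would see "69.90%" after the change.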
[jira] [Updated] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9488: - Description: RM logs are accumulated with YARNFeatureNotEnabledException when running Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} {code} 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8050, call Call#5 Retry#0 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles from 172.26.81.91:41198 org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource profile is not enabled, please enable resource profile feature before using its functions. (by setting yarn.resourcemanager.resource-profiles.enabled to true) at org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) at org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) {code} was: RM logs are accumulated with YARNFeatureNotEnabledException when running DIstributed Shell jobs while {{ClientRMService#getResourceProfiles}} {code} 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8050, call Call#5 Retry#0 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles from 172.26.81.91:41198 org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource profile is not enabled, please enable resource profile feature before using its functions. (by setting yarn.resourcemanager.resource-profiles.enabled to true) at org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) at org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) {code} > Skip YARNFeatureNotEnabledException from ClientRMService > 
> > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs while {{ClientRMService#getResourceProfiles}} > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 80
[jira] [Commented] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400
[ https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908154#comment-16908154 ] Szilard Nemeth commented on YARN-9461: -- Kicked off build again, in order to have a fresh Jenkins result. > TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken > fails with HTTP 400 > --- > > Key: YARN-9461 > URL: https://issues.apache.org/jira/browse/YARN-9461 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, test >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Minor > Attachments: YARN-9461-001.patch > > > The test > {{TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken}} > sometimes fails with the following error: > {noformat} > java.io.IOException: Server returned HTTP response code: 400 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:462) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:283) > {noformat} > The problem is that for whatever reason, Jetty seems to execute the token > cancellation REST call twice. First we get HTTP 200 OK, but the second > request fails with HTTP 400 Bad Request. > The {{MockRM}} instance is static. Something could be a problem in this class > and it turned out that using separate {{MockRM}} instances solves the > flakiness. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
Peter Bacsko created YARN-9749: -- Summary: TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk Key: YARN-9749 URL: https://issues.apache.org/jira/browse/YARN-9749 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Peter Bacsko Assignee: Adam Antal TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It was most likely introduced by YARN-9676 (resetting HEAD to the previous commit and then re-running the test passes). {noformat} [INFO] Running org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 s <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl [ERROR] testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) Time elapsed: 2.361 s <<< FAILURE! java.lang.AssertionError: The set of paths for deletion are not the same as expected: actual size: 0 vs expected size: 1 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) at 
org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) at org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown Source) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) ... {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908150#comment-16908150 ] Hadoop QA commented on YARN-9217: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 224 unchanged - 2 fixed = 224 total (was 226) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 39s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 2s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}115m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9217 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977703/YARN-9217.011.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedcli
[jira] [Updated] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
[ https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9749: --- Component/s: (was: yarn) test log-aggregation > TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk > > > Key: YARN-9749 > URL: https://issues.apache.org/jira/browse/YARN-9749 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Reporter: Peter Bacsko >Assignee: Adam Antal >Priority: Major > > TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It > was most likely introduced by YARN-9676 (resetting HEAD to the previous > commit and then re-running the test passes). > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl > [ERROR] > testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl) > Time elapsed: 2.361 s <<< FAILURE! 
> java.lang.AssertionError: The set of paths for deletion are not the same as > expected: actual size: 0 vs expected size: 1 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39) > at > org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96) > at > org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) > at > org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49) > at > org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown > Source) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469) > ... > {noformat}
[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails
[ https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908100#comment-16908100 ] Adam Antal commented on YARN-9743: -- Built with OpenJDK 11.0.4, targeting JDK 11, and run on JDK 11 using the following command: {noformat} mvn clean test -Dit.test=None -Djavac.version=11 -Dtest=org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices#testContextFactory {noformat} Still produces this error with the same stack trace. [~Prabhu Joseph] can you take a look at this issue? Looks like you have touched that piece of code lately - maybe you know what the issue is. I searched around on the internet, but found no solution. Other projects targeting Java 11 do seem to be facing this problem as well. > [JDK11] TestTimelineWebServices.testContextFactory fails > > > Key: YARN-9743 > URL: https://issues.apache.org/jira/browse/YARN-9743 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.2.0 >Reporter: Adam Antal >Priority: Major > > Tested on OpenJDK 11.0.2 on a Mac. > Stack trace: > {noformat} > [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: > 36.016 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices > [ERROR] > testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices) > Time elapsed: 1.031 s <<< ERROR! 
> java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory > at > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > at > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > at java.base/java.lang.Class.forName0(Native Method) > at java.base/java.lang.Class.forName(Class.java:315) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112) > at > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat}
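The {{ClassNotFoundException}} above comes from the fact that Java 11 removed the bundled JAXB implementation from the JDK (JEP 320), so the internal class {{com.sun.xml.internal.bind.v2.ContextFactory}} no longer exists at runtime. A minimal probe, independent of the Hadoop code, shows whether the running JDK still ships that class:

```java
// Minimal probe: reports whether the JDK still bundles the internal JAXB
// ContextFactory. On Java 8 this prints true; on Java 11+ it prints false,
// which is exactly the ClassNotFoundException path in the stack trace above.
public class JaxbProbe {
    static final String INTERNAL_FACTORY =
        "com.sun.xml.internal.bind.v2.ContextFactory";

    static boolean internalJaxbPresent() {
        try {
            Class.forName(INTERNAL_FACTORY);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(internalJaxbPresent());
    }
}
```

On Java 11 the usual remedy is to stop reflecting on JDK-internal classes and depend on an external JAXB runtime instead (for example the {{com.sun.xml.bind:jaxb-impl}} Maven artifact); whether that is the right fix for this particular code path is for the assignee to judge.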
[jira] [Commented] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908096#comment-16908096 ] Hadoop QA commented on YARN-8586: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}127m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 36s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 107 unchanged - 10 fixed = 107 total (was 117) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 69m 45s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}257m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:080e9d0f9b3 | | JIRA Issue | YARN-8586 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977693/YARN-8586-branch-3.1.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9b33067ec310 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.1 / 96ea7f0 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24569/testReport/ | | Max. process+thread count | 833 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-9480) createAppDir() in LogAggregationService shouldn't block dispatcher thread of ContainerManagerImpl
[ https://issues.apache.org/jira/browse/YARN-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908094#comment-16908094 ] Hadoop QA commented on YARN-9480: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 188 unchanged - 1 fixed = 189 total (was 189) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 11s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 76m 6s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9480 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977701/YARN-9480.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 60c9ecbc2c78 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3468164 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24571/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24571/artifact/out/patch-unit-hadoop-yarn-project_ha
[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9217: --- Attachment: YARN-9217.011.patch > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9217: --- Attachment: (was: YARN-9217.011.patch) > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9217: --- Attachment: YARN-9217.011.patch > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
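The lenient behaviour proposed in the description — warn on a bad GPU configuration and fall back to 0 usable GPUs instead of failing NodeManager startup — can be sketched as follows. The parser below is illustrative only, not the actual NodeManager code; class and method names are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of lenient "index:minor" GPU device parsing:
// any malformed entry logs a warning and yields an empty device list,
// so the node can still start and run non-GPU jobs.
public class LenientGpuConfigSketch {

    static final class GpuDevice {
        final int index, minorNumber;
        GpuDevice(int index, int minorNumber) {
            this.index = index;
            this.minorNumber = minorNumber;
        }
    }

    static List<GpuDevice> parseAllowedDevices(String config) {
        List<GpuDevice> devices = new ArrayList<>();
        if (config == null || config.trim().isEmpty()) {
            System.err.println("WARN: no GPU devices configured, assuming 0 GPUs");
            return devices;
        }
        for (String entry : config.split(",")) {
            String[] parts = entry.trim().split(":");
            try {
                if (parts.length != 2) {
                    throw new NumberFormatException(
                        "expected index:minor, got " + entry);
                }
                devices.add(new GpuDevice(Integer.parseInt(parts[0]),
                                          Integer.parseInt(parts[1])));
            } catch (NumberFormatException e) {
                // Instead of aborting NM startup, warn and fall back to 0 GPUs.
                System.err.println("WARN: bad GPU config '" + entry
                    + "', assuming 0 GPUs: " + e.getMessage());
                return new ArrayList<>();
            }
        }
        return devices;
    }
}
```

The same idea applies to the auto-discovery path: an unusable {{nvidia-smi}} binary or an {{IOException}} while probing it would be caught, logged as a warning, and treated as "0 GPUs available" rather than a fatal startup error.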
[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes
[ https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908049#comment-16908049 ] Szilard Nemeth commented on YARN-9676: -- Thanks [~adam.antal]! > Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected > classes > > > Key: YARN-9676 > URL: https://issues.apache.org/jira/browse/YARN-9676 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Fix For: 3.3.0, 3.2.1 > > > During the development of the last items of YARN-6875, it was typically > difficult to extract information about the internal state of some log > aggregation related classes (e.g. {{AppLogAggregatorImpl}} and > {{LogAggregationFileController}}). > On my fork I added a few more messages to those classes, such as: > - displaying the number of log aggregation cycles > - displaying the names of the files currently considered for log aggregation > by containers > - immediately displaying any exception caught (and sent to the RM in the > diagnostic messages) during the log aggregation process. > Those messages were quite useful for debugging when an issue occurred, but > otherwise they flooded the NM log file with messages that are usually not > needed. I suggest adding (some of) these messages at DEBUG or TRACE level.
[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes
[ https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908046#comment-16908046 ] Adam Antal commented on YARN-9676: -- Cherry-picking failed because YARN-8273 only got in branch-3.2 and not branch-3.1. I'd rather omit the patch to branch-3.1. It is good as it is, thanks. I resolve this issue. > Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected > classes > > > Key: YARN-9676 > URL: https://issues.apache.org/jira/browse/YARN-9676 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Fix For: 3.3.0, 3.2.1 > > > During the development of the last items of YARN-6875, it was typically > difficult to extract information about the internal state of some log > aggregation related classes (e.g. {{AppLogAggregatiorImpl}} and > {{LogAggregationFileController}}). > On my fork I added a few more messages to those classes like: > - displaying the number of log aggregation cycles > - displaying the names of the files currently considered for log aggregation > by containers > - immediately displaying any exception caught (and sent to the RM in the > diagnostic messages) during the log aggregation process. > Those messages were quite useful for debugging if any issue occurs, but > otherwise it flooded the NM log file with these messages that are usually not > needed. I suggest to add (some of) these messages in DEBUG or TRACE level. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908038#comment-16908038 ] Adam Antal commented on YARN-5857: -- The test timed out because it was waiting for the latch, which was only counted down when the lock was released. That happened after 35 seconds, while the test has a 30-second timeout. We should make sure all the threads are active when we count them, so I'd rather put the latch countDown call inside the try block, right after the rLock.tryLock call. > TestLogAggregationService.testFixedSizeThreadPool fails intermittently on > trunk > --- > > Key: YARN-5857 > URL: https://issues.apache.org/jira/browse/YARN-5857 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-5857-001.patch, testFixedSizeThreadPool failure > reproduction > > > {noformat} > testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) > Time elapsed: 0.11 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139) > {noformat} > Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/
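The fix proposed in the comment above can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual TestLogAggregationService code: the latch is counted down right after the tryLock call, inside the try block, so every worker registers as active before the test's timeout can fire.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

public class LatchCountDownSketch {
    // Worker that counts the latch down immediately after attempting the
    // lock, instead of after the lock is released, so the test's await on
    // the latch does not have to outlive the lock-holding period.
    static Runnable worker(Lock rLock, CountDownLatch latch) {
        return () -> {
            boolean acquired = false;
            try {
                acquired = rLock.tryLock(1, TimeUnit.SECONDS);
                latch.countDown(); // moved here: right after tryLock, inside try
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                if (acquired) {
                    rLock.unlock();
                }
            }
        };
    }
}
```

With the countDown inside the try block, a test thread that blocks on the lock still signals that it is active, which is the property the flaky test was missing.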
[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908030#comment-16908030 ] Hadoop QA commented on YARN-9217: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 224 unchanged - 2 fixed = 224 total (was 226) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 49s{color} | {color:red} hadoop-yarn-api in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 39s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 57s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 99m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 | | JIRA Issue | YARN-9217 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977695/YARN-9217.010.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e766ba2afa77 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git
[jira] [Commented] (YARN-9488) Skip YARNFeatureNotEnabledException from ClientRMService
[ https://issues.apache.org/jira/browse/YARN-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908029#comment-16908029 ] Adam Antal commented on YARN-9488: -- +1 (non-binding) on patch v2. > Skip YARNFeatureNotEnabledException from ClientRMService > > > Key: YARN-9488 > URL: https://issues.apache.org/jira/browse/YARN-9488 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9488-001.patch, YARN-9488-002.patch, > YARN-9488-002.patch > > > RM logs are accumulated with YARNFeatureNotEnabledException when running > Distributed Shell jobs when {{ClientRMService#getResourceProfiles}} is called > {code} > 2019-04-16 07:10:47,699 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 8050, call Call#5 Retry#0 > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getResourceProfiles > from 172.26.81.91:41198 > org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource > profile is not enabled, please enable resource profile feature before using > its functions. 
(by setting yarn.resourcemanager.resource-profiles.enabled to > true) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191) > at > org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1833) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:670) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:665) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
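For operators who actually want the feature rather than quieter logs, the exception message quoted above names the relevant switch. A yarn-site.xml fragment would look like this (the property name is taken from the message itself, not from the patch under review):

```xml
<!-- yarn-site.xml: enable resource profiles so getResourceProfiles
     no longer throws YARNFeatureNotEnabledException -->
<property>
  <name>yarn.resourcemanager.resource-profiles.enabled</name>
  <value>true</value>
</property>
```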
[jira] [Updated] (YARN-9746) RM should merge local config for token renewal
[ https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-9746: --- Summary: RM should merge local config for token renewal (was: Rm should only rewrite partial jobConf passed by app when supporting multi-cluster token renew) > RM should merge local config for token renewal > -- > > Key: YARN-9746 > URL: https://issues.apache.org/jira/browse/YARN-9746 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Attachments: YARN-9746-01.patch > > > This issue links to YARN-5910. > When to support multi-cluster delegation token renew, the path of YARN-5910 > works in most scenarios. > But when intergrating with Oozie, we encounter some problems. In Oozie having > multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA > token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew > another cluster's token, YARN-5910 was patched and related config was set. > The config is as follows > {code:xml} > > mapreduce.job.send-token-conf > > dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$ > > > dfs.nameservices > > hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04 > > > dfs.ha.namenodes.hadoop-clusterB-ns01 > nn1,nn2 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1 > namenode01-clusterB.hadoop:8020 > > > > dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2 > namenode02-clusterB.hadoop:8020 > > > > dfs.client.failover.proxy.provider.hadoop-clusterB-ns01 > > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider > > {code} > However, the MR_DELEGATION_TOKEN could‘t be renewed, because of lacking some > config. Although we can set the required configurations through the app, this > is not a good idea. 
So I think the RM should only rewrite the jobConf passed by the > app to solve the above situation.
[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907985#comment-16907985 ] Peter Bacsko commented on YARN-9217: Rebased patch (again) + introduced new fail-fast property. > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch, YARN-9217.010.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9217: --- Attachment: YARN-9217.010.patch > Nodemanager will fail to start if GPU is misconfigured on the node or GPU > drivers missing > - > > Key: YARN-9217 > URL: https://issues.apache.org/jira/browse/YARN-9217 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9217.001.patch, YARN-9217.002.patch, > YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, > YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, > YARN-9217.009.patch, YARN-9217.010.patch > > > Nodemanager will not start > 1. If Autodiscovery is enabled: > * If nvidia-smi path is misconfigured or the file does not exist. > * There is 0 GPU found > * If the file exists but it is not pointing to an nvidia-smi > * if the binary is ok but there is an IOException > 2. If the manually configured GPU devices are misconfigured > * Any index:minor number format failure will cause a problem > * 0 configured device will cause a problem > * NumberFormatException is not handled > It would be a better option to add warnings about the configuration, set 0 > available GPUs and let the node work and run non-gpu jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907983#comment-16907983 ] Hadoop QA commented on YARN-9100: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 10 unchanged - 6 fixed = 10 total (was 16) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 2s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9100 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977691/YARN-9100-008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6025450bf327 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3468164 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24568/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24568/testReport/ | | Max. process+thread count | 448 (vs. ulim
[jira] [Updated] (YARN-9746) Rm should only rewrite partial jobConf passed by app when supporting multi-cluster token renew
[ https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-9746: --- Description: This issue links to YARN-5910. When to support multi-cluster delegation token renew, the path of YARN-5910 works in most scenarios. But when intergrating with Oozie, we encounter some problems. In Oozie having multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew another cluster's token, YARN-5910 was patched and related config was set. The config is as follows {code:xml} mapreduce.job.send-token-conf dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$ dfs.nameservices hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04 dfs.ha.namenodes.hadoop-clusterB-ns01 nn1,nn2 dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1 namenode01-clusterB.hadoop:8020 dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2 namenode02-clusterB.hadoop:8020 dfs.client.failover.proxy.provider.hadoop-clusterB-ns01 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider {code} However, the MR_DELEGATION_TOKEN could‘t be renewed, because of lacking some config. Although we can set the required configurations through the app, this is not a good idea. So i think rm should only rewrite the jobConf passed by app to solve the above situation. was: This issue links to YARN-5910. When to support multi-cluster delegation token renew, the path of YARN-5910 works in most scenarios. But when intergrating with Oozie, we encounter some problems. 
In Oozie having multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew another cluster's token, YARN-5910 was patched and related config was set. The config is as follows {code:xml} mapreduce.job.send-token-conf dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$ dfs.nameservices hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04 dfs.ha.namenodes.hadoop-clusterB-ns01 nn1,nn2 dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1 namenode01-clusterB.qiyi.hadoop:8020 dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2 namenode02-clusterB.qiyi.hadoop:8020 dfs.client.failover.proxy.provider.hadoop-clusterB-ns01 org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider {code} However, the MR_DELEGATION_TOKEN could‘t be renewed, because of lacking some config. Although we can set the required configurations through the app, this is not a good idea. So i think rm should only rewrite the jobConf passed by app to solve the above situation. > Rm should only rewrite partial jobConf passed by app when supporting > multi-cluster token renew > -- > > Key: YARN-9746 > URL: https://issues.apache.org/jira/browse/YARN-9746 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Attachments: YARN-9746-01.patch > > > This issue links to YARN-5910. > When to support multi-cluster delegation token renew, the path of YARN-5910 > works in most scenarios. > But when intergrating with Oozie, we encounter some problems. In Oozie having > multi delegation tokens including HDFS_DELEGATION_TOKEN(another cluster HA > token) and MR_DELEGATION_TOKEN(Oozie mr launcher token), to support renew > another cluster's token, YARN-5910 was patched and related config was set. 
> The config is as follows > {code:xml} > > mapreduce.job
[jira] [Updated] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-8586: --- Attachment: YARN-8586-branch-3.1.001.patch > Extract log aggregation related fields and methods from RMAppImpl > - > > Key: YARN-8586 > URL: https://issues.apache.org/jira/browse/YARN-8586 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-8586-branch-3.1.001.patch, YARN-8586.001.patch, > YARN-8586.002.patch, YARN-8586.002.patch, YARN-8586.003.patch, > YARN-8586.004.patch, YARN-8586.branch-3.2.001.patch > > > Given that RMAppImpl is already above 2000 lines and it is very complex, as a > very simple > and straightforward step, all Log aggregation related fields and methods > could be extracted to a new class. > The clients of RMAppImpl may access the same methods and RMAppImpl would > delegate all those calls to the newly introduced class. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907951#comment-16907951 ] Szilard Nemeth commented on YARN-9100: -- Patch looks good overall; I only found one minor bit of weirdness in the test code (TestGpuResourceAllocator). In TestGpuResourceAllocator#assertAllocatedGpu we have: {code:java} assertEquals(1, allocation.getAllowedGPUs().size()); assertEquals(0, allocation.getDeniedGPUs().size()); Set<GpuDevice> allowedGPUs = allocation.getAllowedGPUs(); assertEquals(1, allowedGPUs.size()); GpuDevice allocatedGpu = (GpuDevice) allowedGPUs.toArray()[0]; assertEquals(expectedGpu, allocatedGpu); assertAssignmentInStateStore(expectedGpu, container); {code} I think the code block of {code:java} Set<GpuDevice> allowedGPUs = allocation.getAllowedGPUs(); assertEquals(1, allowedGPUs.size()); {code} is superfluous, as the size of the allowed GPUs list is already checked above for the allocation. Please fix this minor bit and I think we are good to go! Thanks! > Add tests for GpuResourceAllocator and do minor code cleanup > > > Key: YARN-9100 > URL: https://issues.apache.org/jira/browse/YARN-9100 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9100-004.patch, YARN-9100-005.patch, > YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, > YARN-9100.001.patch, YARN-9100.002.patch, YARN-9100.003.patch > > > Add tests for GpuResourceAllocator and do minor code cleanup > - Improved log and exception messages > - Added some new debug logs > - Some methods are named like *Copy, these are returning copies of internal > data structures. The word "copy" is just a noise in their name, so they have > been renamed. Additionally, the copied data structures modified to be > immutable. > - The waiting loop in method assignGpus was decoupled into a new class, > RetryCommand. 
> Some more words about the new class RetryCommand: > There are some similar waiting loops in the code in: AMRMClient, > AMRMClientAsync and even in GenericTestUtils (see waitFor method). > RetryCommand could be a future replacement of these duplicated code, as it > gives a solution to this waiting loop problem in a generic way. > The only downside of the usage of RetryCommand in GpuResourceAllocator > (startGpuAssignmentLoop) is the ugly exception handling part, but that's > solely because how Java deals with checked exceptions vs. lambdas. If there's > a cleaner way to solve the exception handling, I'm open for any suggestions. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
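The duplicated size check flagged in the review above could also be removed by extracting a small helper that asserts and fetches the single element in one step. A hypothetical sketch (names invented for illustration, not the actual test code):

```java
import java.util.Set;

public final class SingleElement {
    // Asserts the set holds exactly one element and returns it, replacing
    // the duplicated assertEquals(1, ...size()) checks plus the
    // toArray()[0] cast in assertAllocatedGpu.
    static <T> T single(Set<T> set) {
        if (set.size() != 1) {
            throw new AssertionError(
                "expected exactly one element, got " + set.size());
        }
        return set.iterator().next();
    }
}
```

In the test this would collapse the repeated checks into a single `GpuDevice allocatedGpu = single(allocation.getAllowedGPUs());` style call.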
[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9100: - Attachment: YARN-9100-008.patch > Add tests for GpuResourceAllocator and do minor code cleanup > > > Key: YARN-9100 > URL: https://issues.apache.org/jira/browse/YARN-9100 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9100-004.patch, YARN-9100-005.patch, > YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, > YARN-9100.001.patch, YARN-9100.002.patch, YARN-9100.003.patch > > > Add tests for GpuResourceAllocator and do minor code cleanup > - Improved log and exception messages > - Added some new debug logs > - Some methods are named like *Copy, these are returning copies of internal > data structures. The word "copy" is just a noise in their name, so they have > been renamed. Additionally, the copied data structures modified to be > immutable. > - The waiting loop in method assignGpus were decoupled into a new class, > RetryCommand. > Some more words about the new class RetryCommand: > There are some similar waiting loops in the code in: AMRMClient, > AMRMClientAsync and even in GenericTestUtils (see waitFor method). > RetryCommand could be a future replacement of these duplicated code, as it > gives a solution to this waiting loop problem in a generic way. > The only downside of the usage of RetryCommand in GpuResourceAllocator > (startGpuAssignmentLoop) is the ugly exception handling part, but that's > solely because how Java deals with checked exceptions vs. lambdas. If there's > a cleaner way to solve the exception handling, I'm open for any suggestions. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
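The issue description above mentions RetryCommand as a generic replacement for the duplicated waiting loops in AMRMClient, AMRMClientAsync, and GenericTestUtils#waitFor. The real class lives in the patch; the general shape of such a polling loop can be sketched like this (a simplified, hypothetical version with no relation to the patch's actual API):

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public final class RetrySketch {
    // Polls the supplier until it yields a non-null result or the
    // deadline passes, sleeping between attempts; returns null on timeout
    // so the caller decides how to report the failure.
    static <T> T retry(Supplier<T> attempt, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            T result = attempt.get();
            if (result != null) {
                return result;
            }
            if (System.currentTimeMillis() >= deadline) {
                return null;
            }
            TimeUnit.MILLISECONDS.sleep(intervalMs);
        }
    }
}
```

The checked InterruptedException here is the same friction the description notes about lambdas and checked exceptions: any supplier that itself throws a checked exception has to wrap it.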
[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907946#comment-16907946 ]

Szilard Nemeth commented on YARN-9100:
--------------------------------------
Hi [~pbacsko]!
Patch006 had some trivial import conflicts, so I resolved them and uploaded patch007. Pending Jenkins.
[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-9100:
---------------------------------
    Attachment: YARN-9100-007.patch
[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup
[ https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907944#comment-16907944 ]

Szilard Nemeth commented on YARN-9100:
--------------------------------------
Hi [~pbacsko]!
I'm about to review the latest patch, and I agree with removing RetryCommand from this patch. Quoting from the description:
{quote}
Some more words about the new class RetryCommand:
There are similar waiting loops in the code in AMRMClient, AMRMClientAsync, and even in GenericTestUtils (see the waitFor method). RetryCommand could be a future replacement for this duplicated code, as it solves the waiting-loop problem in a generic way.
The only downside of using RetryCommand in GpuResourceAllocator (startGpuAssignmentLoop) is the ugly exception handling, but that is solely because of how Java deals with checked exceptions vs. lambdas. If there's a cleaner way to handle the exceptions, I'm open to suggestions.
{quote}
What about filing a follow-up jira to consolidate the retry logic of GpuResourceAllocator and several other classes like AMRMClient, AMRMClientAsync, GenericTestUtils, etc.? I think what we had in patch003 for RetryCommand can be a good starting point for extracting the retry behaviour from the mentioned classes into one common class. What do you think?
[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
[ https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907942#comment-16907942 ]

Szilard Nemeth commented on YARN-9217:
--------------------------------------
Hi [~pbacsko]!
{quote}
Fundamental question: is this the way we want to use this plugin? Just asking because we might accidentally mask erratic behavior. E.g. a Hadoop user might think he has a cluster with 10 GPUs. In reality, the plugin failed to detect some cards, and only 5 NMs support GPU scheduling. If it's not explicitly displayed, the user might be under the impression that 10 GPUs are ready to run YARN workloads. This can be very misleading. At the very least, a fail-fast method should be considered.
{quote}
I agree with your approach on the fail-fast config flag, so please fix the TODO and upload a new patch, then I can start reviewing it! Thanks!

> Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-9217
>                 URL: https://issues.apache.org/jira/browse/YARN-9217
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Antal Bálint Steinbach
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9217.001.patch, YARN-9217.002.patch, YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, YARN-9217.009.patch
>
> Nodemanager will not start:
> 1. If autodiscovery is enabled:
> * if the nvidia-smi path is misconfigured or the file does not exist
> * if 0 GPUs are found
> * if the file exists but is not pointing to an nvidia-smi binary
> * if the binary is OK but an IOException occurs
> 2. If the manually configured GPU devices are misconfigured:
> * any index:minor-number format failure will cause a problem
> * 0 configured devices will cause a problem
> * NumberFormatException is not handled
> It would be a better option to add warnings about the configuration, set 0 available GPUs, and let the node keep working and run non-GPU jobs.
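[The fail-fast flag discussed above could be sketched roughly as follows. This is NOT the YARN-9217 patch code: the configuration key, class, and method names here are all hypothetical, chosen only to illustrate the two behaviors under discussion (fail fast on misconfiguration vs. warn and advertise 0 GPUs so the node can still run non-GPU jobs).]

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of fail-fast vs. degrade-gracefully GPU discovery.
// The property name and discovery API are illustrative, not real YARN keys.
public class GpuDiscoverySketch {

    // Illustrative property name only; not an actual YARN configuration key.
    static final String GPU_FAIL_FAST = "yarn.nodemanager.resource.gpu.fail-fast";

    static List<Integer> discoverGpus(boolean failFast) {
        try {
            return runNvidiaSmi();  // may fail: missing binary, bad path, IO error
        } catch (Exception e) {
            if (failFast) {
                // Fail-fast: stop NM startup so the misconfiguration is visible,
                // instead of silently running a cluster with fewer GPUs than expected.
                throw new IllegalStateException("GPU discovery failed", e);
            }
            // Degrade gracefully: warn, advertise 0 GPUs, keep serving non-GPU jobs.
            System.err.println("WARN: GPU discovery failed, reporting 0 GPUs: " + e);
            return Collections.emptyList();
        }
    }

    // Stand-in for invoking nvidia-smi; simulates a missing binary here.
    private static List<Integer> runNvidiaSmi() throws Exception {
        throw new java.io.IOException("nvidia-smi not found");
    }

    public static void main(String[] args) {
        // With fail-fast off, a broken setup yields an empty GPU list.
        System.out.println(discoverGpus(false).size());  // prints 0
    }
}
```

[The trade-off in the comment thread maps directly onto this branch: the quoted concern about a user believing they have 10 usable GPUs is the degrade-gracefully path; the fail-fast path surfaces the problem at NM startup.]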
[jira] [Updated] (YARN-9748) Allow capacity-scheduler configuration on HDFS
[ https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhoukang updated YARN-9748:
---------------------------
    Description: Improvement: support auto reload from HDFS

> Allow capacity-scheduler configuration on HDFS
> ----------------------------------------------
>
>                 Key: YARN-9748
>                 URL: https://issues.apache.org/jira/browse/YARN-9748
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 3.1.2
>            Reporter: zhoukang
>            Priority: Major
>
> Improvement: support auto reload of the capacity-scheduler configuration from HDFS.