[jira] [Commented] (YARN-10297) TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently
[ https://issues.apache.org/jira/browse/YARN-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136293#comment-17136293 ] Manikandan R commented on YARN-10297: - [~jhung] Patch LGTM. Can you please take a look and commit? > TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails > intermittently > --- > > Key: YARN-10297 > URL: https://issues.apache.org/jira/browse/YARN-10297 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-10297.001.patch, YARN-10297.002.patch > > > After YARN-6492, testFairSchedulerContinuousSchedulingInitTime fails > intermittently when running {{mvn test -Dtest=TestContinuousScheduling}} > {noformat}[INFO] Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.682 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] > testFairSchedulerContinuousSchedulingInitTime(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling) > Time elapsed: 0.194 s <<< ERROR! > org.apache.hadoop.metrics2.MetricsException: Metrics source > PartitionQueueMetrics,partition= already exists! 
> at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152) > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionMetrics(QueueMetrics.java:362) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.incrPendingResources(QueueMetrics.java:601) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updatePendingResources(AppSchedulingInfo.java:388) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateResourceRequests(AppSchedulingInfo.java:183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateResourceRequests(SchedulerApplicationAttempt.java:456) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:898) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling.testFairSchedulerContinuousSchedulingInitTime(TestContinuousScheduling.java:375) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 
{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10297) TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently
[ https://issues.apache.org/jira/browse/YARN-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133916#comment-17133916 ] Manikandan R commented on YARN-10297: - Thanks [~Jim_Brennan]. LGTM. Please fix whitespace issues. > TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails > intermittently > --- > > Key: YARN-10297 > URL: https://issues.apache.org/jira/browse/YARN-10297 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-10297.001.patch > > > After YARN-6492, testFairSchedulerContinuousSchedulingInitTime fails > intermittently when running {{mvn test -Dtest=TestContinuousScheduling}} > {noformat}[INFO] Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.682 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] > testFairSchedulerContinuousSchedulingInitTime(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling) > Time elapsed: 0.194 s <<< ERROR! > org.apache.hadoop.metrics2.MetricsException: Metrics source > PartitionQueueMetrics,partition= already exists! 
> at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152) > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionMetrics(QueueMetrics.java:362) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.incrPendingResources(QueueMetrics.java:601) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updatePendingResources(AppSchedulingInfo.java:388) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateResourceRequests(AppSchedulingInfo.java:183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateResourceRequests(SchedulerApplicationAttempt.java:456) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:898) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling.testFairSchedulerContinuousSchedulingInitTime(TestContinuousScheduling.java:375) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 
{noformat}
[jira] [Commented] (YARN-9964) Queue metrics turn negative when relabeling a node with running containers to default partition
[ https://issues.apache.org/jira/browse/YARN-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123338#comment-17123338 ] Manikandan R commented on YARN-9964: [~jhung] The YARN-6492 patch covered these fixes too. Can we close this? > Queue metrics turn negative when relabeling a node with running containers to > default partition > > > Key: YARN-9964 > URL: https://issues.apache.org/jira/browse/YARN-9964 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Priority: Major > > YARN-6467 changed queue metrics logic to only update certain metrics if it's > for default partition. But if an app runs containers in a labeled node, then > this node is moved to default partition, then the container is released, this > container's resource won't increment queue's allocated resource, but will > decrement.
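The failure mode described in YARN-9964 can be reproduced with a toy model. The sketch below is not Hadoop code: `ToyQueueAllocated` and its methods are hypothetical names, and the only rule it encodes is the one stated in the description, namely that the queue metric changes only for default-partition events, with the partition read from the node at the time of the event.

```java
/** Toy model of a queue's allocated-MB metric (hypothetical, not Hadoop code). */
final class ToyQueueAllocated {
    private long allocatedMB = 0;

    /** Only allocations on a default-partition node (empty label) are counted. */
    void onAllocate(String nodePartition, long mb) {
        if (nodePartition.isEmpty()) {
            allocatedMB += mb;
        }
    }

    /** Only releases on a default-partition node (empty label) are counted. */
    void onRelease(String nodePartition, long mb) {
        if (nodePartition.isEmpty()) {
            allocatedMB -= mb;
        }
    }

    long allocatedMB() {
        return allocatedMB;
    }

    public static void main(String[] args) {
        ToyQueueAllocated queue = new ToyQueueAllocated();

        String nodePartition = "x";             // node starts out labeled "x"
        queue.onAllocate(nodePartition, 1024);  // skipped: not the default partition

        nodePartition = "";                     // node is relabeled to the default partition
        queue.onRelease(nodePartition, 1024);   // counted: decrement without a matching increment

        System.out.println("allocatedMB = " + queue.allocatedMB()); // prints "allocatedMB = -1024"
    }
}
```

The increment and the decrement disagree only because each looks up the node's partition at a different moment; recording the partition once per container would keep the pair symmetric in this model.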
[jira] [Assigned] (YARN-9964) Queue metrics turn negative when relabeling a node with running containers to default partition
[ https://issues.apache.org/jira/browse/YARN-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R reassigned YARN-9964: -- Assignee: Manikandan R > Queue metrics turn negative when relabeling a node with running containers to > default partition > > > Key: YARN-9964 > URL: https://issues.apache.org/jira/browse/YARN-9964 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > > YARN-6467 changed queue metrics logic to only update certain metrics if it's > for default partition. But if an app runs containers in a labeled node, then > this node is moved to default partition, then the container is released, this > container's resource won't increment queue's allocated resource, but will > decrement.
[jira] [Resolved] (YARN-9767) PartitionQueueMetrics Issues
[ https://issues.apache.org/jira/browse/YARN-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manikandan R resolved YARN-9767.
Resolution: Fixed
> PartitionQueueMetrics Issues
>
> Key: YARN-9767
> URL: https://issues.apache.org/jira/browse/YARN-9767
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-9767.001.patch
>
> The intent of this Jira is to track separately the issues and observations encountered during YARN-6492 development.
> Observations:
> Please refer to https://issues.apache.org/jira/browse/YARN-6492?focusedCommentId=16904027&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16904027
> 1. Partition info is extracted from both the request and the node, which causes a problem. For example:
>
> Node N has been mapped to Label X (non-exclusive). Queue A has been configured with the ANY node label. App A requested resources from Queue A and its containers ran on Node N for some reason. During the AbstractCSQueue#allocateResource call, the node partition (from SchedulerNode) is used for the calculation. Say an allocate call has been fired for 3 containers of 1 GB each; the outcome is then
> a. PartitionDefault * queue A -> pending mb is 3 GB
> b. PartitionX * queue A -> pending mb is -3 GB
>
> Metric #a is derived because the app request was fired without any label specification. After allocation is over, pending resources are normally decreased. That decrement uses the node partition info, hence metric #b is derived.
>
> Given this kind of situation, we will need to put some thought into computing the metrics correctly.
>
> 2. Though the intent of this jira is to do Partition Queue Metrics, we would like to retain the existing Queue Metrics for backward compatibility (as you can see from the jira's discussion).
> With this patch and the YARN-9596 patch, QueueMetrics for a queue would be overridden with either some specific partition's values or the default partition's values. It could be vice versa as well. For example, after a queue (say queue A) has been initialised with some min and max cap and also with a node label's min and max cap, QueueMetrics (availableMB) for queue A returns values based on the node label's cap config.
> I've been working on these observations to provide a fix and attached .005.WIP.patch. The focus of .005.WIP.patch is to ensure that availableMB and availableVcores are correct (please refer to observation #2 above). Added more asserts in {{testQueueMetricsWithLabelsOnDefaultLabelNode}} to ensure the fix for #2 is working properly.
> One more thing to note: user metrics for availableMB and availableVcores at the root queue were not there even before, and the same behaviour is retained. User metrics for availableMB and availableVcores are available only at the child queue level, and also with partitions.
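Observation 1 above (3 GB pending on the default partition, -3 GB on partition X) can be traced with a small self-contained model. This is a sketch only: `ToyPartitionPending` is a hypothetical class, not part of YARN, and it assumes nothing beyond what the description states, namely that the increment is keyed by the request's label and the decrement by the node's partition.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy per-partition pending-MB counters for one queue (hypothetical, not YARN code). */
final class ToyPartitionPending {
    private final Map<String, Long> pendingMB = new HashMap<>();

    /** Adds to the pending counter of the given partition. */
    void incrPending(String partition, long mb) {
        pendingMB.merge(partition, mb, Long::sum);
    }

    /** Subtracts from the pending counter of the given partition. */
    void decrPending(String partition, long mb) {
        pendingMB.merge(partition, -mb, Long::sum);
    }

    long pending(String partition) {
        return pendingMB.getOrDefault(partition, 0L);
    }

    public static void main(String[] args) {
        ToyPartitionPending queueA = new ToyPartitionPending();
        String requestLabel = "";   // the app asked without a label: default partition
        String nodePartition = "x"; // the containers landed on a node labeled X

        for (int i = 0; i < 3; i++) {
            queueA.incrPending(requestLabel, 1024);  // request time: keyed by request label
        }
        for (int i = 0; i < 3; i++) {
            queueA.decrPending(nodePartition, 1024); // allocation time: keyed by node partition
        }

        System.out.println("default: " + queueA.pending("")); // prints "default: 3072"
        System.out.println("x: " + queueA.pending("x"));      // prints "x: -3072"
    }
}
```

In this model the two counters cancel only if both calls use the same key, which is one way to read the thread's later note about asserting pending metrics when containers pending on the default partition are allocated on partition "x". Whether the actual patch takes that route is not stated here.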
[jira] [Commented] (YARN-9767) PartitionQueueMetrics Issues
[ https://issues.apache.org/jira/browse/YARN-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123337#comment-17123337 ]
Manikandan R commented on YARN-9767:
The YARN-6492 patch covered these fixes too. Hence closing this.
> PartitionQueueMetrics Issues
>
> Key: YARN-9767
> URL: https://issues.apache.org/jira/browse/YARN-9767
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-9767.001.patch
>
> The intent of this Jira is to track separately the issues and observations encountered during YARN-6492 development.
> Observations:
> Please refer to https://issues.apache.org/jira/browse/YARN-6492?focusedCommentId=16904027&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16904027
> 1. Partition info is extracted from both the request and the node, which causes a problem. For example:
>
> Node N has been mapped to Label X (non-exclusive). Queue A has been configured with the ANY node label. App A requested resources from Queue A and its containers ran on Node N for some reason. During the AbstractCSQueue#allocateResource call, the node partition (from SchedulerNode) is used for the calculation. Say an allocate call has been fired for 3 containers of 1 GB each; the outcome is then
> a. PartitionDefault * queue A -> pending mb is 3 GB
> b. PartitionX * queue A -> pending mb is -3 GB
>
> Metric #a is derived because the app request was fired without any label specification. After allocation is over, pending resources are normally decreased. That decrement uses the node partition info, hence metric #b is derived.
>
> Given this kind of situation, we will need to put some thought into computing the metrics correctly.
>
> 2. Though the intent of this jira is to do Partition Queue Metrics, we would like to retain the existing Queue Metrics for backward compatibility (as you can see from the jira's discussion).
> With this patch and the YARN-9596 patch, QueueMetrics for a queue would be overridden with either some specific partition's values or the default partition's values. It could be vice versa as well. For example, after a queue (say queue A) has been initialised with some min and max cap and also with a node label's min and max cap, QueueMetrics (availableMB) for queue A returns values based on the node label's cap config.
> I've been working on these observations to provide a fix and attached .005.WIP.patch. The focus of .005.WIP.patch is to ensure that availableMB and availableVcores are correct (please refer to observation #2 above). Added more asserts in {{testQueueMetricsWithLabelsOnDefaultLabelNode}} to ensure the fix for #2 is working properly.
> One more thing to note: user metrics for availableMB and availableVcores at the root queue were not there even before, and the same behaviour is retained. User metrics for availableMB and availableVcores are available only at the child queue level, and also with partitions.
[jira] [Commented] (YARN-10297) TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently
[ https://issues.apache.org/jira/browse/YARN-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120409#comment-17120409 ] Manikandan R commented on YARN-10297: - In this test case, I don't see a situation of rm being started twice. Can you try the other approach (setting mini cluster mode) in setup and see? > TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails > intermittently > --- > > Key: YARN-10297 > URL: https://issues.apache.org/jira/browse/YARN-10297 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Priority: Major > > After YARN-6492, testFairSchedulerContinuousSchedulingInitTime fails > intermittently when running {{mvn test -Dtest=TestContinuousScheduling}} > {noformat}[INFO] Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.682 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] > testFairSchedulerContinuousSchedulingInitTime(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling) > Time elapsed: 0.194 s <<< ERROR! > org.apache.hadoop.metrics2.MetricsException: Metrics source > PartitionQueueMetrics,partition= already exists! 
> at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152) > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionMetrics(QueueMetrics.java:362) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.incrPendingResources(QueueMetrics.java:601) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updatePendingResources(AppSchedulingInfo.java:388) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateResourceRequests(AppSchedulingInfo.java:183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateResourceRequests(SchedulerApplicationAttempt.java:456) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:898) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling.testFairSchedulerContinuousSchedulingInitTime(TestContinuousScheduling.java:375) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 
{noformat}
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120407#comment-17120407 ] Manikandan R commented on YARN-6492: Ok, Makes sense. Patch LGTM. > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Fix For: 3.2.2, 3.4.0, 3.3.1, 3.1.5 > > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-branch-2.10.016.patch, YARN-6492-branch-2.10.019.patch, > YARN-6492-branch-2.8.014.patch, YARN-6492-branch-2.9.015.patch, > YARN-6492-branch-3.1.018.patch, YARN-6492-branch-3.2.017.patch, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, YARN-6492.011.WIP.patch, > YARN-6492.012.WIP.patch, YARN-6492.013.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10297) TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently
[ https://issues.apache.org/jira/browse/YARN-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120303#comment-17120303 ] Manikandan R commented on YARN-10297: - Yes, we can make it synchronized. Regarding the test failures: when I debugged the TestCapacitySchedulerAutoQueueCreation test failures, I came across two approaches to fix the problem. I fixed it by stopping the earlier created rm. The other approach is to use DefaultMetricsSystem.setMiniClusterMode(true); I have come across this alternative in many other test cases. > TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails > intermittently > --- > > Key: YARN-10297 > URL: https://issues.apache.org/jira/browse/YARN-10297 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Priority: Major > > After YARN-6492, testFairSchedulerContinuousSchedulingInitTime fails > intermittently when running {{mvn test -Dtest=TestContinuousScheduling}} > {noformat}[INFO] Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.682 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling > [ERROR] > testFairSchedulerContinuousSchedulingInitTime(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling) > Time elapsed: 0.194 s <<< ERROR! > org.apache.hadoop.metrics2.MetricsException: Metrics source > PartitionQueueMetrics,partition= already exists! 
> at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152) > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionMetrics(QueueMetrics.java:362) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.incrPendingResources(QueueMetrics.java:601) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updatePendingResources(AppSchedulingInfo.java:388) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.internalAddResourceRequests(AppSchedulingInfo.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateResourceRequests(AppSchedulingInfo.java:183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateResourceRequests(SchedulerApplicationAttempt.java:456) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:898) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling.testFairSchedulerContinuousSchedulingInitTime(TestContinuousScheduling.java:375) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 
{noformat}
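The two test-isolation approaches mentioned in the comment above (stopping the earlier RM, or calling `DefaultMetricsSystem.setMiniClusterMode(true)`) can be contrasted with a simplified model of a name registry. The sketch below is not Hadoop code: `ToySourceRegistry` and its methods are hypothetical, the counter-suffix scheme is an assumption made for illustration, and `unregister` stands in for the assumption that stopping an RM releases its metrics source names.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for a process-wide metrics source registry (hypothetical, not Hadoop code). */
final class ToySourceRegistry {
    private final Set<String> names = new HashSet<>();
    private boolean miniClusterMode = false;
    private int counter = 0;

    synchronized void setMiniClusterMode(boolean on) {
        miniClusterMode = on;
    }

    /** Registers a source name and returns the name actually used. */
    synchronized String register(String name) {
        if (names.add(name)) {
            return name; // first registration under this name
        }
        if (!miniClusterMode) {
            throw new IllegalStateException("Metrics source " + name + " already exists!");
        }
        // Assumed de-duplication scheme: append a counter until the name is free.
        String unique;
        do {
            unique = name + "-" + (++counter);
        } while (!names.add(unique));
        return unique;
    }

    /** Assumption: models the effect of stopping the earlier owner of the name. */
    synchronized void unregister(String name) {
        names.remove(name);
    }

    public static void main(String[] args) {
        String source = "PartitionQueueMetrics,partition=";

        // Approach 1: tear down the first owner before the second one registers.
        ToySourceRegistry a = new ToySourceRegistry();
        a.register(source);
        a.unregister(source);
        System.out.println(a.register(source));

        // Approach 2: tolerate duplicates by handing out a distinct name.
        ToySourceRegistry b = new ToySourceRegistry();
        b.setMiniClusterMode(true);
        b.register(source);
        System.out.println(b.register(source));
    }
}
```

Both methods are `synchronized` so that the check and the insert happen as one step. Without that, two threads could both pass the check for the same name, which is the kind of check-then-register race an intermittent "already exists" error suggests.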
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120300#comment-17120300 ] Manikandan R commented on YARN-6492: {quote}For the branch-2.10 patch, do we need to remove the{\quote} This method has been used only in test cases. Yes, we can remove the method itself and modify test cases as well. > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Fix For: 3.2.2, 3.4.0, 3.3.1, 3.1.5 > > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-branch-2.10.016.patch, YARN-6492-branch-2.8.014.patch, > YARN-6492-branch-2.9.015.patch, YARN-6492-branch-3.1.018.patch, > YARN-6492-branch-3.2.017.patch, YARN-6492-junits.patch, YARN-6492.001.patch, > YARN-6492.002.patch, YARN-6492.003.patch, YARN-6492.004.patch, > YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, > YARN-6492.008.WIP.patch, YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, > YARN-6492.011.WIP.patch, YARN-6492.012.WIP.patch, YARN-6492.013.patch, > partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492-branch-2.10.016.patch YARN-6492-branch-2.9.015.patch YARN-6492-branch-2.8.014.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Fix For: 3.4.0 > > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-branch-2.10.016.patch, YARN-6492-branch-2.8.014.patch, > YARN-6492-branch-2.9.015.patch, YARN-6492-junits.patch, YARN-6492.001.patch, > YARN-6492.002.patch, YARN-6492.003.patch, YARN-6492.004.patch, > YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, > YARN-6492.008.WIP.patch, YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, > YARN-6492.011.WIP.patch, YARN-6492.012.WIP.patch, YARN-6492.013.patch, > partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119731#comment-17119731 ] Manikandan R commented on YARN-6492: [~jhung] Thanks. Attached patches for branches 2.8, 2.9 & 2.10. The following methods need to be checked only in branch-2.8: QueueMetrics#allocateResources(String partition, String user, Resource res) and QueueMetrics#releaseResources(String partition, String user, Resource res). > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Fix For: 3.4.0 > > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, YARN-6492.011.WIP.patch, > YARN-6492.012.WIP.patch, YARN-6492.013.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful.
[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116842#comment-17116842 ] Manikandan R edited comment on YARN-6492 at 5/26/20, 3:56 PM: -- Reg line 2542, retained the old asserts and added some more asserts to ensure pending resources metrics are correct when containers pending on "default" partition has been allocated with "x" partition. Fixed whitespaces. TestCapacitySchedulerAutoQueueCreation test failures has been fixed by stopping the old rm before starting up new rm in some test cases. This change has opened up couple of more asserts failures. Fixed those as well by using correct rm and cs variables. Sure, can upload. After committing to trunk? was (Author: maniraj...@gmail.com): Reg line 2542, retained the old asserts and added some more asserts to ensure pending resources metrics are correct when containers pending on "default" partition has been allocated with "x" partition. Fixed whitespaces. TestCapacitySchedulerAutoQueueCreation test failures has been fixed by stop the old rm before starting up new rm in some test cases. This change has opened up couple of more asserts failures. Fixed those as well by using correct rm and cs variables. Sure, can upload. After committing to trunk? 
> Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, YARN-6492.011.WIP.patch, > YARN-6492.012.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492.012.WIP.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, YARN-6492.011.WIP.patch, > YARN-6492.012.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492.011.WIP.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, YARN-6492.011.WIP.patch, > partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116138#comment-17116138 ]

Manikandan R commented on YARN-6492:

Addressed all comments and covered almost all checkstyle/whitespace/javadoc/findbugs/asflicense issues.

The TestCapacitySchedulerAutoQueueCreation failures happen only for test cases that try to mock the rm twice; TestCapacitySchedulerAutoQueueCreation#setupSchedulerInstance does this. Will need to check the reason, since MockRM shuts down the metrics system and clears the queue metrics.

{quote}On line 2539 of TestNodeLabelContainerAllocation, should {quote}

The behaviour was the same even without this patch when user metrics were enabled. The attached junits patch explains this. However, this patch also makes some changes in LeafQueue.java. In both cases, UsersManager#computeUserLimit() does the actual calculation. Can we handle this separately?

Good to see the positive results on a live cluster. Yes, we can handle CSQueueMetrics for partitioned metrics in a separate JIRA.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt,
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt,
> YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch,
> YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch,
> YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch,
> YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each
> queue has one QueueMetrics object which captures metrics either in default
> partition or across all partitions. (After YARN-6467 it will be in default
> partition)
> But having the partition metrics would be very useful.
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492-junits.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492-junits.patch, YARN-6492.001.patch, YARN-6492.002.patch, > YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, > YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, > YARN-6492.009.WIP.patch, YARN-6492.010.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492.010.WIP.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, > YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, > YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, YARN-6492.009.WIP.patch, > YARN-6492.010.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114129#comment-17114129 ]

Manikandan R commented on YARN-6492:

[~jhung] Thanks for your quick turnaround. Addressed all points except the last three comments in the .010 patch.

{quote}On line 2539 of TestNodeLabelContainerAllocation, should{quote}

{quote}On line 2566, how is node1 getting 8 containers if queue A's max capacity is only 50% of 10GB = 5GB?{quote}

Could this be because label 'x' is non-exclusive and the IGNORE_PARTITION_EXCLUSIVITY scheduling mode is used in the UsersManager#computeUserLimit() calculation?

In the meantime, I will dig into these 3 comments in more detail.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt,
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt,
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch,
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch,
> YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, YARN-6492.009.WIP.patch,
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each
> queue has one QueueMetrics object which captures metrics either in default
> partition or across all partitions. (After YARN-6467 it will be in default
> partition)
> But having the partition metrics would be very useful.
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6492: --- Attachment: YARN-6492.009.WIP.patch > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, > YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, > YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, YARN-6492.009.WIP.patch, > partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112371#comment-17112371 ] Manikandan R commented on YARN-6492: [~jhung] [~epayne] Attached .009 patch based on our discussions: # Retain existing default Queue Metrics behaviour (after YARN-6467). # Partition Metrics # Partition * Queue Metrics # Partition * Queue * User Metrics (Only If USER METRICS has been enabled). Please review and share your feedback. > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, > YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, > YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
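The four levels listed in the .009 comment above can be illustrated with a small, self-contained sketch. This is not code from the patch: the class `PartitionMetricsSketch`, its method names, and the string keys are hypothetical, a single counter map stands in for the separate metrics objects, and the default partition is assumed to be the empty string.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hadoop code: one counter map stands in for the
// separate metrics objects that would be registered per level.
final class PartitionMetricsSketch {
    // Assumption: the default partition is represented by an empty string.
    static final String DEFAULT_PARTITION = "";

    private final boolean userMetricsEnabled;
    private final Map<String, Long> pendingMb = new HashMap<>();

    public PartitionMetricsSketch(boolean userMetricsEnabled) {
        this.userMetricsEnabled = userMetricsEnabled;
    }

    public void incrPendingMb(String partition, String queue, String user, long mb) {
        // 1. Existing QueueMetrics behaviour: updated for the default partition only.
        if (DEFAULT_PARTITION.equals(partition)) {
            add("queue=" + queue, mb);
        }
        // 2. Partition metrics: one counter per partition.
        add("partition=" + partition, mb);
        // 3. Partition * Queue metrics.
        add("partition=" + partition + ",queue=" + queue, mb);
        // 4. Partition * Queue * User metrics, only when user metrics are enabled.
        if (userMetricsEnabled) {
            add("partition=" + partition + ",queue=" + queue + ",user=" + user, mb);
        }
    }

    public long get(String key) {
        return pendingMb.getOrDefault(key, 0L);
    }

    private void add(String key, long mb) {
        pendingMb.merge(key, mb, Long::sum);
    }

    public static void main(String[] args) {
        PartitionMetricsSketch m = new PartitionMetricsSketch(true);
        m.incrPendingMb(DEFAULT_PARTITION, "root.a", "alice", 1024);
        m.incrPendingMb("x", "root.a", "alice", 2048);
        System.out.println("queue view:       " + m.get("queue=root.a"));
        System.out.println("partition x view: " + m.get("partition=x"));
    }
}
```

In this sketch a request on partition "x" never touches the plain queue counter, which mirrors the "retain existing default Queue Metrics behaviour" item; whether the real patch structures its updates this way is not confirmed by the thread.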
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107476#comment-17107476 ]

Manikandan R commented on YARN-6492:

Thanks for sharing your views.

I spent a good amount of time developing the patch based on a different notion. Now I will need to shift my thinking completely to modify the patch based on the new conclusions. For example, in the recent patch no metrics method performs the "if partition is default" check. That check needs to be retained for backward compatibility, while the Partition Queue Metrics computation should, at a high level, happen for all partitions.

Will work on the patch and update asap. Thanks.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt,
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt,
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch,
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch,
> YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each
> queue has one QueueMetrics object which captures metrics either in default
> partition or across all partitions. (After YARN-6467 it will be in default
> partition)
> But having the partition metrics would be very useful.
[jira] [Commented] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement
[ https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107328#comment-17107328 ] Manikandan R commented on YARN-10259: - Minor nit: To make it concise, *if* (! iter.hasNext()) { can be used to avoid continue; May be taken up later if not possible now. > Reserved Containers not allocated from available space of other nodes in > CandidateNodeSet in MultiNodePlacement > --- > > Key: YARN-10259 > URL: https://issues.apache.org/jira/browse/YARN-10259 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0, 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10259-001.patch, YARN-10259-002.patch, > YARN-10259-003.patch > > > Reserved Containers are not allocated from the available space of other nodes > in CandidateNodeSet in MultiNodePlacement. > *Repro:* > 1. MultiNode Placement Enabled. > 2. Two nodes h1 and h2 with 8GB > 3. Submit app1 AM (5GB) which gets placed in h1 and app2 AM (5GB) which gets > placed in h2. > 4. Submit app3 AM which is reserved in h1 > 5. Kill app2 which frees space in h2. > 6. app3 AM never gets ALLOCATED > RM logs shows YARN-8127 fix rejecting the allocation proposal for app3 AM on > h2 as it expects the assignment to be on same node where reservation has > happened. > {code} > 2020-05-05 18:49:37,264 DEBUG [AsyncDispatcher event handler] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:commonReserve(573)) - Application attempt > appattempt_1588684773609_0003_01 reserved container > container_1588684773609_0003_01_01 on node host: h1:1234 #containers=1 > available= used=. 
This attempt > currently has 1 reserved containers at priority 0; currentReservation > > 2020-05-05 18:49:37,264 INFO [AsyncDispatcher event handler] > fica.FiCaSchedulerApp (FiCaSchedulerApp.java:apply(670)) - Reserved > container=container_1588684773609_0003_01_01, on node=host: h1:1234 > #containers=1 available= used= > with resource= >RESERVED=[(Application=appattempt_1588684773609_0003_01; > Node=h1:1234; Resource=)] > > 2020-05-05 18:49:38,283 DEBUG [Time-limited test] > allocator.RegularContainerAllocator > (RegularContainerAllocator.java:assignContainer(514)) - assignContainers: > node=h2 application=application_1588684773609_0003 priority=0 > pendingAsk=,repeat=1> > type=OFF_SWITCH > 2020-05-05 18:49:38,285 DEBUG [Time-limited test] fica.FiCaSchedulerApp > (FiCaSchedulerApp.java:commonCheckContainerAllocation(371)) - Try to allocate > from reserved container container_1588684773609_0003_01_01, but node is > not reserved >ALLOCATED=[(Application=appattempt_1588684773609_0003_01; > Node=h2:1234; Resource=)] > {code} > Attached testcase which reproduces the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105962#comment-17105962 ]

Manikandan R commented on YARN-10154:

Looks good to me. Thanks [~prabhujoseph] and [~sunilg]

> CS Dynamic Queues cannot be configured with absolute resources
> --
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.3
> Reporter: Sunil G
> Assignee: Manikandan R
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch,
> YARN-10154.003.patch, YARN-10154.addendum-001.patch,
> YARN-10154.addendum-002.patch, YARN-10154.addendum-003.patch,
> YARN-10154.addendum-004.patch
>
>
> In CS, ManagedParent Queue and its template cannot take absolute resource
> value like
> [memory=8192,vcores=8]
> This Jira is to track and improve the configuration reading module of
> DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103876#comment-17103876 ]

Manikandan R commented on YARN-6492:

From this Jira onwards, the YARN-6467 behaviour (default partition metrics with a queue-wise breakup) would be available as given below:
{code:java}
"name" : "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=default,q0=root,q1=a"
...{code}
So this Jira won't reverse YARN-6467 as such. Admins interested in "default" partition metrics can use this JMX o/p for their analysis.

At the same time, we would retain the "original queuemetrics computation" as given below:
{code:java}
"name" : "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=a"
...{code}
This "original queuemetrics computation" does not take partitions into account at all. It works purely from the queue perspective and lets admins view the metrics from the queue angle only, which is how it worked before YARN-6467.

Thoughts?

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt,
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt,
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch,
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch,
> YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each
> queue has one QueueMetrics object which captures metrics either in default
> partition or across all partitions. (After YARN-6467 it will be in default
> partition)
> But having the partition metrics would be very useful.
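The two JMX bean names quoted in the preceding comment differ only in their `name` value and in the presence of a `partition` key, so a monitoring client can tell them apart by parsing the name. A minimal sketch, assuming only the standard `javax.management.ObjectName` class; the class `MetricsBeanNames` and its methods are hypothetical helpers and not part of Hadoop:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: inspects the bean names quoted in the thread.
final class MetricsBeanNames {

    // Parses a bean name, turning the checked JMX exception into an unchecked one.
    private static ObjectName parse(String beanName) {
        try {
            return new ObjectName(beanName);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Not a valid JMX object name: " + beanName, e);
        }
    }

    // True when the bean name carries a "partition" key.
    public static boolean isPartitionBean(String beanName) {
        return parse(beanName).getKeyProperty("partition") != null;
    }

    // Rebuilds the dotted queue path from the q0, q1, ... keys, e.g. "root.a".
    public static String queuePath(String beanName) {
        ObjectName name = parse(beanName);
        List<String> parts = new ArrayList<>();
        for (int i = 0; ; i++) {
            String q = name.getKeyProperty("q" + i);
            if (q == null) {
                break;
            }
            parts.add(q);
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        String partitioned =
            "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=default,q0=root,q1=a";
        String original =
            "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=a";
        System.out.println(isPartitionBean(partitioned) + " " + queuePath(partitioned));
        System.out.println(isPartitionBean(original) + " " + queuePath(original));
    }
}
```

Both names resolve to the same queue path here, which matches the point of the comment: the partitioned bean and the original queue-only bean describe the same queue from two different angles.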
[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103138#comment-17103138 ]

Manikandan R edited comment on YARN-6492 at 5/9/20, 7:07 AM:

Thanks [~jhung] and [~epayne] for your support.

I would like to clear up some things at a high level, especially the scope/requirements of this Jira, and revisit the background (thought process) behind how these related Jiras were created, to ensure that we are on the same page.

YARN-6467 computes metrics only for the default partition. It was created as an interim step towards the major goal of "Providing Metrics at Partition Level" to the customers, and that major goal is this Jira. Since YARN-6467 is a stepping stone for this Jira, it was coded so that it could easily accommodate this Jira's changes (for example, just removing the if (partition is default) check inside each metric computation method was expected to take care of most things, with no further changes required on the caller side).

Though YARN-6467 covers some aspects, it also created confusion (for queues associated with multiple partitions), because it changed the original QueueMetrics computation behaviour. The original QueueMetrics computation is the metrics computation purely from the queue perspective, irrespective of how many partitions the queue is associated with; it has nothing to do with partitions. After YARN-6467, QueueMetrics started providing metrics only for the "default" partition, replacing the original behaviour. Another reason for taking this path was that this Jira was expected to go into trunk immediately after YARN-6467 (as planned :) ), so there would not have been any inconsistency in the original queue metrics computation behaviour, but that did not happen.

So, whenever we said "backward compatibility" we referred to this original QueueMetrics computation, not "existing QueueMetrics should still only contain metrics for default partition". In other words, the original QueueMetrics computation is the code/behaviour before YARN-6467.

Now, let me explain the scope of this Jira. We would like to achieve the following things:

# Partition * Queue Metrics: A partition can be associated with many queues, so we need a per-queue breakup, hence Partition * Queue metrics. The proposed structure is

PartitionMetric (labelX)
  QueueMetric (A)
    metrics
    Usermetrics
  QueueMetric (A1)
    metrics
    Usermetrics
  QueueMetric (A2)
    metrics
    Usermetrics
  QueueMetric (B)
    metrics
    Usermetrics
PartitionMetric (labelY)
  QueueMetric (A)
  QueueMetric (A1)
  QueueMetric (A2)
  QueueMetric (B)
…

{{QueueMetrics#getPartitionQueueMetrics }} takes care of registering this object in the metrics system, and the object is then used for all metric computations. Sample JMX o/p is
{code:java}
"name" : "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=x,q0=root,q1=a"
...{code}

# Partition metrics: Partition-level metrics computation. This can help admins analyse the usage at the partition level. The proposed structure is

PartitionMetric (labelX)
  metrics
PartitionMetric (labelY)
  metrics

{{PartitionQueueMetrics#getPartitionQueueMetrics }} takes care of registering this object in the metrics system, and the object is then used for all metric computations. Sample JMX o/p is
{code:java}
"name" : "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=x"
...{code}

In addition to these 2 changes, we would like to retain the original QueueMetrics computation behaviour.

Hopefully the above explains why the assert below has been changed:
{noformat}
assertEquals(10 * GB, leafQueueA.getMetrics().getAvailableMB());{noformat}
is changed to
{noformat}
assertEquals(22 * GB, leafQueueA.getMetrics().getAvailableMB());{noformat}
This assert was added as part of YARN-9596 to ensure YARN-6467 works correctly.

The exhaustive unit test changes in YARN-9767 explain the difference between Partition * Queue Metrics, Partition Metrics and the original QueueMetrics very clearly.

What changes should this patch contain? Yes, there is some confusion, as some changes are in YARN-9767. #2 described in YARN-9767 should be in this patch; otherwise, this patch is incomplete from a feature rollout perspective. [~jhung] said #1 described in YARN-9767 was there even before. My understanding is that it should be happening only after YARN-6467. Would it be better if we handle that here too? What do you think?

Please share your opinions. After that, I will post the proper patch first, and then reviews can be taken up on it.

{quote}In order to be consistent with other API responses like {{/ws/v1/cluster/scheduler}}, I think this should just be an empty string. So, I would expect the JMX response to look like the following for DEFAULT_PARTITION
[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103138#comment-17103138 ] Manikandan R edited comment on YARN-6492 at 5/9/20, 7:04 AM: - Thanks [~jhung] and [~epayne] for your support. I would like to clear some things at high level, especially on scope/requirements of this Jira and revisit the background (thought process) on how all these related Jira has been created to ensure that we are on same page: YARN-6467 computes metrics only for default partition. It has been created as an interim step towards the major goal of "Providing Metrics at Partition Level" to the customers. Major goal is nothing but this JIRA. Since YARN-6467 is stepping stone for this JIRA, it has been coded in a way that it should easily accomodate this Jira changes in a simplistic way (For example, Just removing if(partition is default) check inside each metric computation method expected to take care most of the things and no more changes required on collar side). Though YARN-6467 covers some aspects, it had created confusion (for the queue's associated with multiple partitions) as well. Original QueueMetrics computation behaviour has been changed. Original QueueMetrics computation is nothing but the metrics computation only from Queue perspective irrespective of how many partitions it has been associated to and nothing to do with Partitions. It started providing metrics only for "default" partition by replacing the original behaviour. Another reasons for taking up this path is, this Jira expected to go into the trunk immediately after YARN-6467 (as planned :) ) and hence there won't be any inconsistency in original queue metrics computation behaviour, but it didn't happened. So, whenever we said "backward compatibility" we referred to this Original QueueMetrics computation, not "existing QueueMetrics should still only contain metrics for default partition". 
In other words, Original QueueMetrics computation is nothing but the code/behaviour before YARN-6467. Now, let me explain scope of this JIRA. We would like to achieve the following things: # Partition * Queue Metrics: A partition can be associated with many queues. So we need to break up, hence we need Partition * Queue metrics. Proposed structure is PartitionMetric (labelX) QueueMetric (A) metrics Usermetrics QueueMetric (A1) metrics Usermetrics QueueMetric (A2) metrics Usermetrics QueueMetric (B) metrics Usermetrics PartitionMetric (labelY) QueueMetric (A) QueueMetric (A1) QueueMetric (A2) QueueMetric (B) … {{QueueMetrics#getPartitionQueueMetrics }} takes care of this registration into Metric system and use this object for all metric computations. Sample JMX o/p is {code:java} "name" : "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=x,q0=root,q1=a" ...{code} 2. Partition metrics: Partition level metrics computation. This can help Admins to analyse the usage at Partition level. Proposed structure is PartitionMetric (labelX) metrics PartitionMetric (labelY) metrics {{PartitionQueueMetrics#getPartitionQueueMetrics }} takes care of this registration into Metric system and use this object for all metric computations. Sample JMX o/p is {code:java} "name" : "Hadoop:service=ResourceManager,name=PartitionQueueMetrics,partition=x" ...{code} In addition to these 2 changes, we would like to retain the Original QueueMetrics computation behaviour. Hope the above explanation explains why the below assert has been changed: {noformat} assertEquals(10 * GB, leafQueueA.getMetrics().getAvailableMB());{noformat} is changed to {noformat} assertEquals(22 * GB, leafQueueA.getMetrics().getAvailableMB());{noformat} This assert has been added as part of YARN-9596 to ensure YARN-6467 works correctly. YARN-9767 exhaustive unit test changes explain the difference between Partition * Queue Metrics, Partition Metrics and Original QueueMetrics very clearly. 
What changes should this patch contain? Yes, there is some confusion, as some changes are in YARN-9767. #2 described in YARN-9767 should be in this patch; otherwise, this patch is incomplete from a feature rollout perspective. [~jhung] said that #1 described in YARN-9767 was already there before. My understanding is that it should be happening only after YARN-6467. Would it be better if we handle that here too? What do you think? Please share your opinions. After that, I will post the proper patch first, and reviews can then be taken up on it.

\{quote}In order to be consistent with other API responses like {{/ws/v1/cluster/scheduler}}, I think this should just be an empty string. So, I would expect the JMX response to look like the following for DEFAULT_PARTIT
[jira] [Updated] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R updated YARN-6492:
-------------------------------
    Attachment: YARN-6492.008.WIP.patch

> Generate queue metrics for each partition
> -----------------------------------------
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, YARN-6492.008.WIP.patch, partition_metrics.txt
>
> We are interested in having queue metrics for all partitions. Right now each queue has one QueueMetrics object which captures metrics either in default partition or across all partitions. (After YARN-6467 it will be in default partition)
> But having the partition metrics would be very useful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101814#comment-17101814 ]

Manikandan R commented on YARN-6492:
------------------------------------

1: We had two different method names until .006.patch. Based on our later discussions, the code was simplified in .007.patch on the reasoning that getPartitionQueueMetrics in QueueMetrics is meant for Partition * Queue metric computation, whereas getPartitionQueueMetrics in PartitionQueueMetrics is meant for Partition metric computation. Since PartitionQueueMetrics is an extension of QueueMetrics, both use the same method name, and the functionality is overridden to suit each need. However, the comments were not clear, so I modified them in .008.patch.

2, 3: This Jira and YARN-9767 have to go in at the same time; otherwise, the feature won't be complete. We are handling the issues separately in YARN-9767 only to make code review easier. YARN-9767 answers most of the concerns raised in #2 and #3. For example, it contains very exhaustive test asserts, which can clear up a lot of confusion and improve our understanding. If it is confusing to have two different patches, we need to decide how to take this further. If you think both Jiras should be taken up in the same patch for code completeness, then let's do that. We can follow whichever path is convenient for us. Thoughts?

4: Yes, we can do that.

> Generate queue metrics for each partition
> -----------------------------------------
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Jonathan Hung
> Assignee: Manikandan R
> Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, YARN-6492.007.WIP.patch, partition_metrics.txt
>
> We are interested in having queue metrics for all partitions. Right now each queue has one QueueMetrics object which captures metrics either in default partition or across all partitions. (After YARN-6467 it will be in default partition)
> But having the partition metrics would be very useful.
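Point 1 of the comment above describes one method name with two behaviours: the base class resolves a Partition * Queue metrics object, and the subclass overrides the same method to resolve a partition-level one. The sketch below illustrates only that override pattern. The classes {{QueueMetricsSketch}} and {{PartitionQueueMetricsSketch}}, the string keys and the shared map are hypothetical simplifications and do not reflect the real QueueMetrics API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for QueueMetrics: counters are kept in a plain map
// keyed by whatever getPartitionQueueMetrics() returns.
class QueueMetricsSketch {
  protected final String queueName;
  private final Map<String, Long> pendingMB;

  QueueMetricsSketch(String queueName, Map<String, Long> pendingMB) {
    this.queueName = queueName;
    this.pendingMB = pendingMB;
  }

  /** Base behaviour: one bucket per (partition, queue) pair. */
  String getPartitionQueueMetrics(String partition) {
    return "partition=" + partition + ",queue=" + queueName;
  }

  /** Adds to whichever bucket the (possibly overridden) lookup selects. */
  void incrPendingMB(String partition, long mb) {
    pendingMB.merge(getPartitionQueueMetrics(partition), mb, Long::sum);
  }
}

// Hypothetical stand-in for PartitionQueueMetrics: same method name,
// overridden so that all queues share one bucket per partition.
class PartitionQueueMetricsSketch extends QueueMetricsSketch {
  PartitionQueueMetricsSketch(String queueName, Map<String, Long> pendingMB) {
    super(queueName, pendingMB);
  }

  @Override
  String getPartitionQueueMetrics(String partition) {
    return "partition=" + partition;
  }
}

final class OverrideDemo {
  public static void main(String[] args) {
    Map<String, Long> pendingMB = new HashMap<>();
    new QueueMetricsSketch("root.a", pendingMB).incrPendingMB("x", 1024);
    new QueueMetricsSketch("root.b", pendingMB).incrPendingMB("x", 2048);
    QueueMetricsSketch partitionLevel =
        new PartitionQueueMetricsSketch("root", pendingMB);
    partitionLevel.incrPendingMB("x", 1024);
    partitionLevel.incrPendingMB("x", 2048);
    System.out.println(pendingMB);
  }
}
```

The trade-off of sharing one method name is the one raised in the review: the call site reads the same in both classes, so the comments have to state which level of metrics each override produces.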
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091697#comment-17091697 ]

Manikandan R commented on YARN-10154:
-------------------------------------

Thanks [~prabhujoseph] and [~sunilg] for the testing and the fix.

> CS Dynamic Queues cannot be configured with absolute resources
> --------------------------------------------------------------
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.3
> Reporter: Sunil G
> Assignee: Manikandan R
> Priority: Major
> Fix For: 3.4.0
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, YARN-10154.003.patch, YARN-10154.addendum-001.patch
>
> In CS, ManagedParent Queue and its template cannot take absolute resource value like
> [memory=8192,vcores=8]
> This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
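The absolute resource value quoted in the issue description, [memory=8192,vcores=8], is a bracketed list of name=value pairs. As a rough illustration of what reading such a value involves, here is a minimal sketch; the class {{AbsoluteResourceSketch}} and its methods are hypothetical and are not the CapacityScheduler configuration parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for values shaped like "[memory=8192,vcores=8]".
final class AbsoluteResourceSketch {
  private AbsoluteResourceSketch() {}

  /** True when the value is wrapped in square brackets. */
  static boolean isAbsolute(String value) {
    String v = value.trim();
    return v.length() >= 2 && v.startsWith("[") && v.endsWith("]");
  }

  /** Parses the bracketed pairs into resource name -> amount, in order. */
  static Map<String, Long> parse(String value) {
    if (!isAbsolute(value)) {
      throw new IllegalArgumentException("Not an absolute resource: " + value);
    }
    String v = value.trim();
    String body = v.substring(1, v.length() - 1).trim();
    Map<String, Long> resources = new LinkedHashMap<>();
    if (body.isEmpty()) {
      return resources;
    }
    for (String pair : body.split(",")) {
      String[] kv = pair.split("=", 2);
      if (kv.length != 2) {
        throw new IllegalArgumentException("Malformed pair: " + pair);
      }
      resources.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
    }
    return resources;
  }

  public static void main(String[] args) {
    System.out.println(parse("[memory=8192,vcores=8]"));
  }
}
```

A value without brackets, such as a percentage capacity, is rejected by {{parse}}, so a caller would check {{isAbsolute}} first and fall back to percentage handling otherwise.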
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091692#comment-17091692 ]

Manikandan R commented on YARN-10049:
-------------------------------------

Can we trigger?

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch, YARN-10049.003.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
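The issue description lists three comparisons: app priority, then start time, then name. A chained comparator is one way to express that tie-breaking order. The sketch below is only an illustration: the {{App}} class is hypothetical, and it assumes that a higher numeric priority should be scheduled first, which the description does not state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class FifoOrderSketch {
  private FifoOrderSketch() {}

  // Hypothetical application record holding only the three compared fields.
  static final class App {
    final String name;
    final int priority;
    final long startTime;

    App(String name, int priority, long startTime) {
      this.name = name;
      this.priority = priority;
      this.startTime = startTime;
    }
  }

  // Assumption: higher priority first; ties go to the earlier start time,
  // and remaining ties are broken by name.
  static final Comparator<App> FIFO = Comparator
      .comparingInt((App a) -> a.priority).reversed()
      .thenComparingLong(a -> a.startTime)
      .thenComparing(a -> a.name);

  /** Returns the app names in scheduling order without mutating the input. */
  static List<String> sortedNames(List<App> apps) {
    List<App> copy = new ArrayList<>(apps);
    copy.sort(FIFO);
    List<String> names = new ArrayList<>();
    for (App app : copy) {
      names.add(app.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<App> apps = new ArrayList<>();
    apps.add(new App("app-b", 1, 100L));
    apps.add(new App("app-a", 1, 100L));
    apps.add(new App("app-c", 5, 300L));
    apps.add(new App("app-d", 1, 50L));
    System.out.println(sortedNames(apps));
  }
}
```

Without the name comparison, two apps with equal priority and start time would compare as equal, so their relative order would depend on how they were inserted.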
[jira] [Commented] (YARN-10239) Capacity is zero for auto created leaf queues after leaf-queue-template.capacity has been updated
[ https://issues.apache.org/jira/browse/YARN-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091690#comment-17091690 ]

Manikandan R commented on YARN-10239:
-------------------------------------

Can we close this, as the issue has been taken care of in https://issues.apache.org/jira/browse/YARN-10154?focusedCommentId=17089904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17089904?

> Capacity is zero for auto created leaf queues after leaf-queue-template.capacity has been updated
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-10239
> URL: https://issues.apache.org/jira/browse/YARN-10239
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Akhil PB
> Assignee: Manikandan R
> Priority: Major
>
> In the scheduler response, the capacity of the auto created leaf queue became zero after leaf-queue-template.capacity of managed parent queue has been updated.
> cc: [~sunilg] [~wangda] [~prabhujoseph]
[jira] [Assigned] (YARN-10239) Capacity is zero for auto created leaf queues after leaf-queue-template.capacity has been updated
[ https://issues.apache.org/jira/browse/YARN-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R reassigned YARN-10239:
----------------------------------
    Assignee: Manikandan R

> Capacity is zero for auto created leaf queues after leaf-queue-template.capacity has been updated
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-10239
> URL: https://issues.apache.org/jira/browse/YARN-10239
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Akhil PB
> Assignee: Manikandan R
> Priority: Major
>
> In the scheduler response, the capacity of the auto created leaf queue became zero after leaf-queue-template.capacity of managed parent queue has been updated.
> cc: [~sunilg] [~wangda] [~prabhujoseph]
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084970#comment-17084970 ]

Manikandan R commented on YARN-10049:
-------------------------------------

[~pbacsko] [~snemeth] I don't think the JUnit failure is related to this patch. However, it would be better to trigger Jenkins again, as the earlier results now return a 404 error. Can we take it forward?

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch, YARN-10049.003.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
[jira] [Updated] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R updated YARN-10154:
-------------------------------
    Attachment: YARN-10154.003.patch

> CS Dynamic Queues cannot be configured with absolute resources
> --------------------------------------------------------------
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.3
> Reporter: Sunil G
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, YARN-10154.003.patch
>
> In CS, ManagedParent Queue and its template cannot take absolute resource value like
> [memory=8192,vcores=8]
> This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081378#comment-17081378 ]

Manikandan R commented on YARN-10154:
-------------------------------------

Sorry for the delay. Attaching .003.patch, which covers the test cases suggested above.

> CS Dynamic Queues cannot be configured with absolute resources
> --------------------------------------------------------------
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.3
> Reporter: Sunil G
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch
>
> In CS, ManagedParent Queue and its template cannot take absolute resource value like
> [memory=8192,vcores=8]
> This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
[jira] [Updated] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R updated YARN-10049:
-------------------------------
    Attachment: YARN-10049.003.patch

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch, YARN-10049.003.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069251#comment-17069251 ]

Manikandan R commented on YARN-10049:
-------------------------------------

[~pbacsko] Thanks. Attached .003.patch.

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067846#comment-17067846 ]

Manikandan R commented on YARN-10049:
-------------------------------------

Rebased the patch, with minor changes to the JUnit tests. [~pbacsko] [~snemeth] Can you please take a look?

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
[jira] [Updated] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R updated YARN-10049:
-------------------------------
    Attachment: YARN-10049.002.patch

> FIFOOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10049
> URL: https://issues.apache.org/jira/browse/YARN-10049
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10049.001.patch, YARN-10049.002.patch
>
> FIFOPolicy of FS does the following comparisons in addition to app priority comparison:
> 1. Using Start time
> 2. Using Name
> Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy of CS.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067836#comment-17067836 ]

Manikandan R commented on YARN-10043:
-------------------------------------

[~snemeth] [~pbacsko] Thanks for your reviews and commit.

> FairOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10043
> URL: https://issues.apache.org/jira/browse/YARN-10043
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-10043.001.patch, YARN-10043.002.patch, YARN-10043.003.patch, YARN-10043.004.patch
>
> FairOrderingPolicy can be improved by using some of the approaches (only relevant) implemented in FairSharePolicy of FS. This improvement has significance in FS to CS migration context.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066864#comment-17066864 ]

Manikandan R commented on YARN-10043:
-------------------------------------

[~snemeth] I am waiting on this. Can we please take it forward?

> FairOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10043
> URL: https://issues.apache.org/jira/browse/YARN-10043
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10043.001.patch, YARN-10043.002.patch, YARN-10043.003.patch, YARN-10043.004.patch
>
> FairOrderingPolicy can be improved by using some of the approaches (only relevant) implemented in FairSharePolicy of FS. This improvement has significance in FS to CS migration context.
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066859#comment-17066859 ] Manikandan R commented on YARN-10154: - [~sunilg] Have you had a chance to review the patch? Thank you. > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch, YARN-10154.002.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
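To make the bracketed absolute-resource form in the YARN-10154 description concrete, here is a minimal sketch of how such a value could be told apart from a plain percentage and parsed. The class name CapacityValueSketch and both methods are hypothetical and are not taken from the YARN-10154 patch or from the Capacity Scheduler code; the sketch only assumes the [name=value,name=value] syntax shown in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of Hadoop or of the YARN-10154 patch. */
public class CapacityValueSketch {

  /** True when the value uses the bracketed absolute form, e.g. [memory=8192,vcores=8]. */
  public static boolean isAbsolute(String value) {
    String v = value.trim();
    return v.startsWith("[") && v.endsWith("]");
  }

  /** Parses [name=value,...] into an ordered map of resource name to amount. */
  public static Map<String, Long> parseAbsolute(String value) {
    if (!isAbsolute(value)) {
      throw new IllegalArgumentException("Not an absolute resource value: " + value);
    }
    String v = value.trim();
    String inner = v.substring(1, v.length() - 1);
    Map<String, Long> resources = new LinkedHashMap<>();
    for (String pair : inner.split(",")) {
      String[] kv = pair.trim().split("=", 2);
      if (kv.length != 2) {
        throw new IllegalArgumentException("Bad resource pair: " + pair);
      }
      resources.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
    }
    return resources;
  }
}
```

A caller would check isAbsolute first and fall back to the existing single-percentage handling when it returns false.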
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063380#comment-17063380 ] Manikandan R commented on YARN-10198: - Should we add a NULL check for {{parent}} after Line No. 163 to handle mappings like u:%user:%primary_group? Other than this, looks good to me. Thanks [~pbacsko] and [~prabhujoseph]. > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch, > YARN-10198-003.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat}
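The NULL check on {{parent}} suggested in the comment above could look roughly like the sketch below. It assumes, without confirmation from the patch, that a mapping such as u:%user:%primary_group carries no parent queue while [managedParentQueue].%primary_group does. PlacementSketch and fullQueuePath are hypothetical names used only for illustration.

```java
/** Hypothetical illustration; not code from the YARN-10198 patch. */
public class PlacementSketch {

  /**
   * Builds the target queue path. The parent is null or empty when the
   * mapping names no parent queue (assumed for u:%user:%primary_group).
   */
  public static String fullQueuePath(String parent, String leaf) {
    if (leaf == null || leaf.isEmpty()) {
      return null; // no leaf queue could be resolved for this user
    }
    if (parent == null || parent.isEmpty()) {
      return leaf; // guard: do not dereference or prepend a missing parent
    }
    return parent + "." + leaf;
  }
}
```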
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062750#comment-17062750 ] Manikandan R commented on YARN-10198: - [~pbacsko] Thanks for extending the patch. 1. Yes. Unlike other placement rules in FS, "SecondaryGroupExistingPlacementRule" expects the queue to exist. Except for "SecondaryGroupExistingPlacementRule", all other rules (even PrimaryGroupExistingPlacementRule) also depend on the "FSPlacementRule#createQueue" flag in conjunction with FSPlacementRule#configuredQueue. The modified patch is in line with the above FS flow. Line No. 161 can make use of the getPrimaryGroup() method. > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat}
[jira] [Commented] (YARN-10200) Add number of containers to RMAppManager summary
[ https://issues.apache.org/jira/browse/YARN-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061053#comment-17061053 ] Manikandan R commented on YARN-10200: - Should we sum up all attempts' totalallocatedcontainers (from RMAppAttemptMetrics) and add it to this summary? > Add number of containers to RMAppManager summary > > > Key: YARN-10200 > URL: https://issues.apache.org/jira/browse/YARN-10200 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Priority: Major > > It would be useful to persist this so we can track containers processed by RM.
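The aggregation proposed in the comment above amounts to a sum over all attempts of an application. The sketch below shows that shape only; the AttemptMetrics interface is a hypothetical stand-in for RMAppAttemptMetrics, and its getter name is an assumption rather than a confirmed signature from the Hadoop code base.

```java
import java.util.List;

/** Hypothetical illustration; not code from YARN-10200. */
public class ContainerSummary {

  /** Stand-in for the per-attempt metrics object (assumed accessor name). */
  public interface AttemptMetrics {
    int getTotalAllocatedContainers();
  }

  /** Sums the allocated-container counts of every attempt of one application. */
  public static long totalAllocatedContainers(List<? extends AttemptMetrics> attempts) {
    long total = 0;
    for (AttemptMetrics attempt : attempts) {
      total += attempt.getTotalAllocatedContainers();
    }
    return total;
  }
}
```

The result is accumulated in a long so that many attempts with large counts cannot overflow an int.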
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061033#comment-17061033 ] Manikandan R commented on YARN-10198: - [~pbacsko] Thanks for the patch. 1. Are we planning to have this check only for regular parent queues in a separate Jira? 2. Are we going to handle [managedParentQueue].%secondary_group in a separate Jira? > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat}
[jira] [Comment Edited] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059200#comment-17059200 ] Manikandan R edited comment on YARN-10154 at 3/14/20, 8:34 AM: --- Yes, the patch takes care of min and max in the format discussed above only. Updated the patch for doc changes. was (Author: maniraj...@gmail.com): Yes, the patch takes care of min and max in the format discussed above only. Once [~sunilg] has reviewed the patch, I will update the doc once and for all with review comments (if any). > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch, YARN-10154.002.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Updated] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10154: Attachment: YARN-10154.002.patch > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch, YARN-10154.002.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059200#comment-17059200 ] Manikandan R commented on YARN-10154: - Yes, the patch takes care of min and max in the format discussed above only. Once [~sunilg] has reviewed the patch, I will update the doc once and for all with review comments (if any). > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059196#comment-17059196 ] Manikandan R commented on YARN-10198: - The NULL check mentioned above has been added to be in line with the YARN-9840 changes. With the above check in place, the client side thinks there is no queue mapping rule, as ApplicationPlacementContext is NULL. Without the above check, ApplicationPlacementContext is not null but the queue name is NULL. In both cases, the "default" queue would be used. Should we do this check only for regular parent queues? > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat}
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055304#comment-17055304 ] Manikandan R commented on YARN-10154: - [~sunilg] Can you please do a quick check on this? > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055301#comment-17055301 ] Manikandan R commented on YARN-10043: - [~snemeth] Can you please check? > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch, YARN-10043.004.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
[jira] [Commented] (YARN-9831) NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService allocate flow
[ https://issues.apache.org/jira/browse/YARN-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048639#comment-17048639 ] Manikandan R commented on YARN-9831: In the latest patch, I think the {{!nodeSet.contains(container.getNodeId())}} block and its {{nodeSet.add(container.getNodeId());}} should be in a critical section. Otherwise, it may lead to inconsistent data while adding (WRITE operation) new tokens into the set, as the read op is followed by a write op and the lock is no longer a write lock. In general, I think we can explore whether the computeIfPresent / computeIfAbsent / compute methods of ConcurrentHashMap can be used to perform operations like add, remove, etc. atomically in this context to avoid write locks. > NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService > allocate flow > > > Key: YARN-9831 > URL: https://issues.apache.org/jira/browse/YARN-9831 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9831.001.patch, YARN-9831.002.patch > > > Currently an attempt's NMToken cannot be generated independently. > Each attempt's allocate flow blocks the others. We should improve this.
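The check-then-add concern in the YARN-9831 comment above can be illustrated with a minimal sketch. NodeTokenTracker and its String node IDs are hypothetical and are not taken from the patch. The sketch relies only on the documented behaviour that add() on a set created by ConcurrentHashMap.newKeySet() is atomic and returns true only when the element was not already present, so a separate contains() call is unnecessary.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the per-attempt node set; not code from YARN-9831. */
public class NodeTokenTracker {
  private final Set<String> nodeSet = ConcurrentHashMap.newKeySet();

  /**
   * Racy without an exclusive lock: two threads can both pass contains()
   * before either calls add(), so both would go on to create a token.
   */
  public boolean racyMarkIfNew(String nodeId) {
    if (!nodeSet.contains(nodeId)) {
      nodeSet.add(nodeId);
      return true;
    }
    return false;
  }

  /** Atomic: add() returns true only for the single caller that inserted the element. */
  public boolean markIfNew(String nodeId) {
    return nodeSet.add(nodeId);
  }
}
```

With markIfNew, the caller creates an NMToken only when the method returns true, which keeps the decision and the update in one atomic step even under a read lock.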
[jira] [Commented] (YARN-9831) NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService allocate flow
[ https://issues.apache.org/jira/browse/YARN-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046840#comment-17046840 ] Manikandan R commented on YARN-9831: [~BilwaST] Thanks for the patch. Had a quick glance. The lock has been changed from "write" to "read" in the {{createAndGetNMToken}} method, assuming there shouldn't be any issues while adding node IDs into the set ( nodeSet.add(container.getNodeId()); ) because it has been created with ConcurrentHashMap.newKeySet(). If this is true, should we apply the same principle to other places where a node gets removed ( removeNodeKey() )? > NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService > allocate flow > > > Key: YARN-9831 > URL: https://issues.apache.org/jira/browse/YARN-9831 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9831.001.patch, YARN-9831.002.patch > > > Currently an attempt's NMToken cannot be generated independently. > Each attempt's allocate flow blocks the others. We should improve this.
[jira] [Commented] (YARN-10155) TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk
[ https://issues.apache.org/jira/browse/YARN-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046758#comment-17046758 ] Manikandan R commented on YARN-10155: - Ok. However, we can get this patch in as it fixes the exception and reduces the waiting time. Post that, we can see the behaviour in Jenkins for some time and then close the Jira based on the results. Thoughts? > TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk > > > Key: YARN-10155 > URL: https://issues.apache.org/jira/browse/YARN-10155 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10155.001.patch, testTokenThreadTimeout.txt, > testTokenThreadTimeout_with_patch.txt > > > The TestDelegationTokenRenewer.testTokenThreadTimeout test committed in > YARN-9768 often fails with timeout.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045710#comment-17045710 ] Manikandan R commented on YARN-10043: - [~snemeth] Can you please take a final look to commit the code? > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch, YARN-10043.004.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
[jira] [Commented] (YARN-10155) TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk
[ https://issues.apache.org/jira/browse/YARN-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045709#comment-17045709 ] Manikandan R commented on YARN-10155: - [~adam.antal] Thanks for your effort. I've been running all my tests through mvn cli with prior clean and build. Can you please try the alternative step? > TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk > > > Key: YARN-10155 > URL: https://issues.apache.org/jira/browse/YARN-10155 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10155.001.patch, testTokenThreadTimeout.txt, > testTokenThreadTimeout_with_patch.txt > > > The TestDelegationTokenRenewer.testTokenThreadTimeout test committed in > YARN-9768 often fails with timeout.
[jira] [Commented] (YARN-9893) Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity setting
[ https://issues.apache.org/jira/browse/YARN-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045707#comment-17045707 ] Manikandan R commented on YARN-9893: [~pbacsko] Part of this requirement would be addressed in YARN-10154. We can handle support for two percentage values in a separate Jira. Is that ok? > Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity > setting > --- > > Key: YARN-9893 > URL: https://issues.apache.org/jira/browse/YARN-9893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Manikandan R >Priority: Major > > Capacity Scheduler does not support two percentage values for leaf queue > capacity and maximum-capacity settings. So, you can't do something like this: > {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=50.0%, > vcores=50.0%}} > On top of that, it's not even possible to define absolute resources: > {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=16384, > vcores=8}} > Only a single percentage value is accepted. > This makes it nearly impossible to properly convert a similar setting from > Fair Scheduler, where such a configuration is valid and accepted > ({{}}).
[jira] [Assigned] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R reassigned YARN-10154: --- Assignee: Manikandan R (was: Sunil G) > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10154.001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Assigned] (YARN-9893) Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity setting
[ https://issues.apache.org/jira/browse/YARN-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R reassigned YARN-9893: -- Assignee: Manikandan R (was: Peter Bacsko) > Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity > setting > --- > > Key: YARN-9893 > URL: https://issues.apache.org/jira/browse/YARN-9893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Manikandan R >Priority: Major > > Capacity Scheduler does not support two percentage values for leaf queue > capacity and maximum-capacity settings. So, you can't do something like this: > {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=50.0%, > vcores=50.0%}} > On top of that, it's not even possible to define absolute resources: > {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=16384, > vcores=8}} > Only a single percentage value is accepted. > This makes it nearly impossible to properly convert a similar setting from > Fair Scheduler, where such a configuration is valid and accepted > ({{}}).
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045698#comment-17045698 ] Manikandan R commented on YARN-10154: - Attaching a WIP patch with a simple test case to validate that the approach is good enough to proceed further. > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Sunil G >Priority: Major > Attachments: YARN-10154.001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Updated] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10154: Attachment: YARN-10154.001.patch > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Sunil G >Priority: Major > Attachments: YARN-10154.001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Updated] (YARN-10155) TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk
[ https://issues.apache.org/jira/browse/YARN-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10155: Attachment: YARN-10155.001.patch > TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk > > > Key: YARN-10155 > URL: https://issues.apache.org/jira/browse/YARN-10155 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10155.001.patch > > > The TestDelegationTokenRenewer.testTokenThreadTimeout test committed in > YARN-9768 often fails with timeout.
[jira] [Commented] (YARN-10155) TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk
[ https://issues.apache.org/jira/browse/YARN-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043538#comment-17043538 ] Manikandan R commented on YARN-10155: - [~inigoiri] I was not able to reproduce this on my machine even after multiple runs. However, as a first step, I fixed the ArithmeticException mentioned in https://issues.apache.org/jira/browse/YARN-9768?focusedCommentId=17022152&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17022152 on the assumption that this would avoid a huge amount of unnecessary I/O into the logs and get us into a cleaner situation in which we can focus on the actual problem. I also reduced the timeout values to avoid waiting unnecessarily. Please take a look. > TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk > > > Key: YARN-10155 > URL: https://issues.apache.org/jira/browse/YARN-10155 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Manikandan R >Priority: Major > > The TestDelegationTokenRenewer.testTokenThreadTimeout test committed in > YARN-9768 often fails with timeout.
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17042026#comment-17042026 ] Manikandan R commented on YARN-10154: - [~sunilg] Can I take this forward? > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Sunil G >Priority: Major > > In CS, ManagedParent Queue and its template cannot take absolute resource > value like > [memory=8192,vcores=8] > This Jira is to track and improve the configuration reading module of > DynamicQueue to support absolute resource values.
[jira] [Assigned] (YARN-10155) TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk
[ https://issues.apache.org/jira/browse/YARN-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R reassigned YARN-10155: --- Assignee: Manikandan R > TestDelegationTokenRenewer.testTokenThreadTimeout fails in trunk > > > Key: YARN-10155 > URL: https://issues.apache.org/jira/browse/YARN-10155 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Manikandan R >Priority: Major > > The TestDelegationTokenRenewer.testTokenThreadTimeout test committed in > YARN-9768 often fails with timeout.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040219#comment-17040219 ] Manikandan R commented on YARN-10043: - [~snemeth] Can you review the latest patch? > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch, YARN-10043.004.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17037902#comment-17037902 ] Manikandan R commented on YARN-6492: [~jhung] and [~epayne] were doing the code reviews. Pinging them..Will follow up and move forward. > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Manikandan R >Priority: Major > Attachments: PartitionQueueMetrics_default_partition.txt, > PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, > YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, > YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, > YARN-6492.007.WIP.patch, partition_metrics.txt > > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in default > partition or across all partitions. (After YARN-6467 it will be in default > partition) > But having the partition metrics would be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: YARN-10043.004.patch > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch, YARN-10043.004.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036349#comment-17036349 ] Manikandan R commented on YARN-10043: - Thanks [~snemeth]. Addressed all comments. {quote}You can remove the type argument to FairOrderingPolicy. {quote} Are you suggesting FairOrderingPolicy policy = new FairOrderingPolicy(); ? If so, we will need to do suppress warnings etc. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
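On the type-argument question in the comment above: one possible reading of the review remark is the Java diamond operator, which drops the repeated type argument on the constructor call without falling back to a raw type, so no suppress-warnings annotation would be needed. A minimal sketch of that reading follows. The generic `Policy` class and the `String` entity type are hypothetical stand-ins, not the real `FairOrderingPolicy` or its entity type.

```java
import java.util.ArrayList;
import java.util.List;

final class DiamondSketch {
    /** Hypothetical generic policy; stands in for a class like FairOrderingPolicy. */
    static final class Policy<S> {
        private final List<S> entities = new ArrayList<>();

        void add(S entity) {
            entities.add(entity);
        }

        int size() {
            return entities.size();
        }
    }

    static Policy<String> build() {
        // Diamond: the compiler infers <String> from the declared type,
        // so this is not a raw type and produces no unchecked warning.
        Policy<String> policy = new Policy<>();
        policy.add("app-1");
        return policy;
    }
}
```

Writing `new Policy()` instead would compile as a raw type and is the form that triggers the unchecked warning mentioned in the comment.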
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033715#comment-17033715 ] Manikandan R commented on YARN-10043: - [~pbacsko] Can you please take a look? YARN-10049 is also dependent on this. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028478#comment-17028478 ] Manikandan R edited comment on YARN-10043 at 2/2/20 4:47 PM: - [~pbacsko] Thanks for your reviews. # Taken care of. # Compare demands ensures that an entity without resource demand gets lower priority than ones that have demands. When both entities have demands ( > 0), there is no actual comparison, hence the changes are made that way. This has also been commented at class level and is similar to the {{FairSharePolicy}} implementation. # For #3 and #4, yes, there are multiple asserts, intended to clearly show that the expected final ones pass after all earlier comparisons (in precedence order) have passed as well. However, I see your point and made the changes by balancing both views. was (Author: maniraj...@gmail.com): [~pbacsko] Thanks for your reviews. # Taken care of. # Compare demands ensures that an entity without resource demand gets lower priority than ones that have demands. When both entities have demands ( > 0), there is no actual comparison, hence the changes are made that way. This has also been commented at class level and is similar to the {{FairSharePolicy}} implementation. # For #3 and #4, yes, there are multiple asserts, intended to clearly show that the expected final ones pass after all earlier comparisons (in precedence order) have passed as well. However, I see your point and made the changes by balancing both views. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. 
This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: YARN-10043.003.patch > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028478#comment-17028478 ] Manikandan R commented on YARN-10043: - [~pbacsko] Thanks for your reviews. # Taken care of. # Compare demands ensures that an entity without resource demand gets lower priority than ones that have demands. When both entities have demands ( > 0), there is no actual comparison, hence the changes are made that way. This has also been commented at class level and is similar to the {{FairSharePolicy}} implementation. # For #3 and #4, yes, there are multiple asserts, intended to clearly show that the expected final ones pass after all earlier comparisons (in precedence order) have passed as well. However, I see your point and made the changes by balancing both views. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
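The demand comparison described in the comment above could be sketched roughly as follows. This is an illustration only, not the code from the patch: the `Entity` class, its `demand` field, and the name `BY_DEMAND` are hypothetical, and demand is reduced to a single number for simplicity.

```java
import java.util.Comparator;

final class DemandFirstSketch {
    /** Hypothetical stand-in for a schedulable entity; only the demand matters here. */
    static final class Entity {
        final String name;
        final long demand;

        Entity(String name, long demand) {
            this.name = name;
            this.demand = demand;
        }
    }

    /**
     * Sorts entities that have no demand after entities that have some.
     * When both have demand (or both have none) it reports a tie, so a
     * later comparator in the chain decides the order.
     */
    static final Comparator<Entity> BY_DEMAND = (a, b) -> {
        boolean aIdle = a.demand == 0;
        boolean bIdle = b.demand == 0;
        if (aIdle == bIdle) {
            return 0;
        }
        return aIdle ? 1 : -1;
    };
}
```

Returning 0 for two entities that both have demand is what "no actual comparison" amounts to here: the size of the demand never influences the order, only its presence.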
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025224#comment-17025224 ] Manikandan R commented on YARN-10043: - [~leftnoteasy] [~sunilg] Can you please take a look? > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-9768: --- Attachment: YARN-9768.010.patch > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch, YARN-9768.010.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
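The timeout-and-retry behaviour requested in the issue description above can be illustrated with standard `java.util.concurrent` primitives. The sketch below is not the YARN-9768 patch: the class `TimedRetrySketch`, the method `callWithTimeout`, and its parameters are hypothetical names, and it assumes the stuck call responds to thread interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedRetrySketch {
    /**
     * Runs the call on a worker thread, waits at most timeoutMs for each
     * attempt, and retries up to maxAttempts times when an attempt times out.
     */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs, int maxAttempts)
            throws InterruptedException, ExecutionException, TimeoutException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        // A cached pool gives each retry a fresh thread, so a retry is not
        // queued behind an earlier attempt that is still stuck.
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = pool.submit(call);
                try {
                    return future.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Interrupt the stuck attempt before trying again.
                    future.cancel(true);
                }
            }
            throw new TimeoutException("gave up after " + maxAttempts + " attempts");
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A caller would wrap the renew call in a `Callable` and pass it in. Failures other than a timeout surface as `ExecutionException` and are not retried in this sketch.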
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024831#comment-17024831 ] Manikandan R commented on YARN-9768: Attaching .010 patch. > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective.
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023064#comment-17023064 ] Manikandan R commented on YARN-9768: can you trigger? > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023056#comment-17023056 ] Manikandan R commented on YARN-10049: - It depends on YARN-10043 changes. Requires rebasing after committing YARN-10043. > FIFOOrderingPolicy Improvements > --- > > Key: YARN-10049 > URL: https://issues.apache.org/jira/browse/YARN-10049 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10049.001.patch > > > FIFOPolicy of FS does the following comparisons in addition to app priority > comparison: > 1. Using Start time > 2. Using Name > Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy > of CS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10103) Capacity scheduler: add support for create=true/false per mapping rule
[ https://issues.apache.org/jira/browse/YARN-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023050#comment-17023050 ] Manikandan R commented on YARN-10103: - [~pbacsko] I think AutoCreateEnabledParentQueue can be used to create leaf queues under a parent queue (for which auto-creation has been enabled) in conjunction with queue mapping rules. Please check the yarn.scheduler.capacity..auto-create-child-queue.enabled property. > Capacity scheduler: add support for create=true/false per mapping rule > -- > > Key: YARN-10103 > URL: https://issues.apache.org/jira/browse/YARN-10103 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Priority: Major > > You can't ask Capacity Scheduler for a mapping to create a queue if it > doesn't exist. > For example, this mapping would use the first rule if the queue exists. If it > doesn't, then it proceeds to the next rule: > {{u:%user:%primary_group.%user:create=false;u:%user%:root.default}} > Let's say user "alice" belongs to the "admins" group. It would first try to > map to {{root.admins.alice}}. But, if the queue doesn't exist, then it places > the application into {{root.default}}.
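The fall-through behaviour proposed in the issue description above can be sketched as a simple loop over already-resolved rules. This is an illustration of the proposal, not Capacity Scheduler code: `MappingRuleSketch`, `Rule`, and `place` are hypothetical names, and the placeholders such as `%user` are assumed to have been substituted beforehand.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class MappingRuleSketch {
    /** Hypothetical resolved mapping rule: a target queue plus its create flag. */
    static final class Rule {
        final String queue;
        final boolean create;

        Rule(String queue, boolean create) {
            this.queue = queue;
            this.create = create;
        }
    }

    /**
     * Returns the queue of the first rule that can be used: either the queue
     * already exists, or the rule allows creating it. A rule with
     * create=false whose queue is missing is skipped.
     */
    static Optional<String> place(List<Rule> rules, Set<String> existingQueues) {
        for (Rule rule : rules) {
            if (rule.create || existingQueues.contains(rule.queue)) {
                return Optional.of(rule.queue);
            }
        }
        return Optional.empty();
    }
}
```

With the rules `root.admins.alice` (create=false) followed by `root.default`, the application lands in `root.default` as long as `root.admins.alice` does not exist, which matches the example in the description.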
[jira] [Commented] (YARN-10102) Capacity scheduler: add support for %specified mapping
[ https://issues.apache.org/jira/browse/YARN-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023044#comment-17023044 ] Manikandan R commented on YARN-10102: - [~pbacsko] I think yarn.scheduler.capacity.queue-mappings-override.enable does the same thing. Queue extracted from mapping rules would be used only if this property has been set to true. > Capacity scheduler: add support for %specified mapping > -- > > Key: YARN-10102 > URL: https://issues.apache.org/jira/browse/YARN-10102 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Priority: Major > > To reduce the gap between Fair Scheduler and Capacity Scheduler, it's > reasonable to have a {{%specified}} mapping. This would be equivalent to the > {{}} placement rule in FS, that is, use the queue that comes in > with the application submission context.
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: YARN-10043.002.patch > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022258#comment-17022258 ] Manikandan R commented on YARN-10043: - Fixed checkstyle, javadoc issues. Junit failure is not related to this patch. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022152#comment-17022152 ] Manikandan R commented on YARN-9768: [~inigoiri] I ran this test 5 times but did not come across this timeout issue. A VM crash occurred once. In addition, I do see a lot of {{java.util.concurrent.ExecutionException: java.lang.ArithmeticException: / by zero}} in the logs. It seems to be related to YARN-9817. Should we trigger Jenkins again and see? > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective.
[jira] [Updated] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-9768: --- Attachment: YARN-9768.009.patch > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021106#comment-17021106 ] Manikandan R commented on YARN-9768: Rebased the patch. Can you please take it forward? > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch, > YARN-9768.009.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019628#comment-17019628 ] Manikandan R commented on YARN-9768: [~inigoiri] Can you please review his comment and commit the code? > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018890#comment-17018890 ] Manikandan R commented on YARN-10049: - Thanks [~sunilg] [~leftnoteasy] Attaching .001.patch based on earlier discussions. {{FIFOComparator}} has been used in FifoOrderingPolicy, FifoOrderingPolicyForPendingApps, FifoOrderingPolicyWithExclusivePartitions (through other class) and FairOrderingPolicy but in different order. So changes made in comparator would be applicable in all above policies. > FIFOOrderingPolicy Improvements > --- > > Key: YARN-10049 > URL: https://issues.apache.org/jira/browse/YARN-10049 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10049.001.patch > > > FIFOPolicy of FS does the following comparisons in addition to app priority > comparison: > 1. Using Start time > 2. Using Name > Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy > of CS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
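The comparison order listed in the issue description above (app priority, then start time, then name) maps naturally onto a chained `Comparator`. The sketch below is not the actual `FIFOComparator`: the `App` class and its fields are hypothetical, and it assumes that a larger priority number means the application should be served first.

```java
import java.util.Comparator;

final class FifoOrderSketch {
    /** Hypothetical application record with only the fields the ordering needs. */
    static final class App {
        final String name;
        final int priority;
        final long startTime;

        App(String name, int priority, long startTime) {
            this.name = name;
            this.priority = priority;
            this.startTime = startTime;
        }
    }

    /**
     * Higher priority first (an assumption of this sketch), then earlier
     * start time, then name as the final tie-breaker.
     */
    static final Comparator<App> FIFO =
            Comparator.comparingInt((App a) -> a.priority).reversed()
                    .thenComparingLong(a -> a.startTime)
                    .thenComparing(a -> a.name);
}
```

Because the name is the last key, two distinct applications with equal priority and start time still get a deterministic order, provided their names differ.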
[jira] [Updated] (YARN-10049) FIFOOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10049: Attachment: YARN-10049.001.patch > FIFOOrderingPolicy Improvements > --- > > Key: YARN-10049 > URL: https://issues.apache.org/jira/browse/YARN-10049 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10049.001.patch > > > FIFOPolicy of FS does the following comparisons in addition to app priority > comparison: > 1. Using Start time > 2. Using Name > Scope of this jira is to achieve the same comparisons in FIFOOrderingPolicy > of CS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: YARN-10043.001.patch > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: (was: YARN-10043.001.patch) > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9831) NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService allocate flow
[ https://issues.apache.org/jira/browse/YARN-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018585#comment-17018585 ] Manikandan R commented on YARN-9831: [~BilwaST] I spent some time on this Jira earlier. Attaching the patch developed then for your reference. Please use it if you have similar thoughts and take it forward. > NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService > allocate flow > > > Key: YARN-9831 > URL: https://issues.apache.org/jira/browse/YARN-9831 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9831.001.patch > > > Currently attempt's NMToken cannot be generated independently. > Each attempts allocate flow blocks each other. We should improve the same
[jira] [Updated] (YARN-9831) NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService allocate flow
[ https://issues.apache.org/jira/browse/YARN-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-9831: --- Attachment: YARN-9831.001.patch > NMTokenSecretManagerInRM#createNMToken blocks ApplicationMasterService > allocate flow > > > Key: YARN-9831 > URL: https://issues.apache.org/jira/browse/YARN-9831 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9831.001.patch > > > Currently, an attempt's NMToken cannot be generated independently. > Each attempt's allocate flow blocks the others. We should improve this.
[jira] [Updated] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-10043: Attachment: YARN-10043.001.patch > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018584#comment-17018584 ] Manikandan R commented on YARN-10043: - Thanks [~leftnoteasy] and everyone for sharing your thoughts. Attached .001.patch based on our discussions. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.