[jira] [Updated] (YARN-9505) Add container allocation latency for Opportunistic Scheduler
[ https://issues.apache.org/jira/browse/YARN-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9505:
    Attachment: YARN-9505.002.patch

> Add container allocation latency for Opportunistic Scheduler
> ------------------------------------------------------------
>
> Key: YARN-9505
> URL: https://issues.apache.org/jira/browse/YARN-9505
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9505.001.patch, YARN-9505.002.patch
>
> This will help in tuning the opportunistic scheduler and its configuration
> parameters.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9505) Add container allocation latency for Opportunistic Scheduler
[ https://issues.apache.org/jira/browse/YARN-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9505:
    Attachment: YARN-9505.001.patch

> Add container allocation latency for Opportunistic Scheduler
> ------------------------------------------------------------
>
> Key: YARN-9505
> URL: https://issues.apache.org/jira/browse/YARN-9505
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9505.001.patch
>
> This will help in tuning the opportunistic scheduler and its configuration
> parameters.
[jira] [Created] (YARN-9505) Add container allocation latency for Opportunistic Scheduler
Abhishek Modi created YARN-9505:
    Summary: Add container allocation latency for Opportunistic Scheduler
    Key: YARN-9505
    URL: https://issues.apache.org/jira/browse/YARN-9505
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Abhishek Modi
    Assignee: Abhishek Modi

This will help in tuning the opportunistic scheduler and its configuration
parameters.
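A per-allocation latency metric of the kind this JIRA proposes could be tracked along these lines. This is a minimal standalone sketch; the class and method names (`AllocationLatencyTracker`, `addAllocateLatency`) are hypothetical and not the actual YARN-9505 patch, which would plug into Hadoop's metrics2 framework instead.

```java
// Illustrative sketch of tracking opportunistic container allocation latency;
// names are hypothetical, not the actual YARN-9505 implementation.
public class AllocationLatencyTracker {
    private long totalLatencyMs = 0;
    private long numAllocations = 0;

    // Record the latency of one opportunistic container allocation.
    public synchronized void addAllocateLatency(long latencyMs) {
        totalLatencyMs += latencyMs;
        numAllocations++;
    }

    // Average allocation latency so far; 0 if nothing has been recorded yet.
    public synchronized long avgAllocateLatencyMs() {
        return numAllocations == 0 ? 0 : totalLatencyMs / numAllocations;
    }
}
```

An average like this is what one would watch while tuning the opportunistic scheduler's configuration parameters.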
[jira] [Commented] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823302#comment-16823302 ]

Abhishek Modi commented on YARN-2889:

Thanks [~elgoiri] for review and committing it.

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch, YARN-2889.004.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
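The per-heartbeat limit described in YARN-2889 amounts to capping how many of an AM's outstanding opportunistic asks are satisfied in one allocate round. A minimal sketch of that idea, with hypothetical names rather than the actual `OpportunisticContainerAllocator` code or configuration key:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of capping opportunistic container allocations per AM heartbeat;
// the class and cap name are illustrative, not the actual YARN-2889 patch.
public class HeartbeatAllocator {
    private final int maxOpportunisticPerHeartbeat;

    public HeartbeatAllocator(int maxOpportunisticPerHeartbeat) {
        this.maxOpportunisticPerHeartbeat = maxOpportunisticPerHeartbeat;
    }

    // Satisfy at most the configured number of the requested containers;
    // the remainder waits for a later heartbeat.
    public List<String> allocate(List<String> requested) {
        int n = Math.min(requested.size(), maxOpportunisticPerHeartbeat);
        return new ArrayList<>(requested.subList(0, n));
    }
}
```

A misbehaving AM that asks for thousands of opportunistic containers is thus throttled to the cap on every heartbeat instead of draining the cluster's opportunistic capacity at once.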
[jira] [Commented] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822434#comment-16822434 ]

Abhishek Modi commented on YARN-2889:

The checkstyle issue is due to the large number of parameters.

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch, YARN-2889.004.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
[jira] [Updated] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-2889:
    Attachment: YARN-2889.004.patch

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch, YARN-2889.004.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822072#comment-16822072 ]

Abhishek Modi commented on YARN-9339:

Thanks [~elgoiri] for the review. I ran both tests locally with my changes and they pass.
testDecreaseAfterIncreaseWithAllocationExpiration also fails randomly in other builds -
I will file a jira to fix this. TestFairSchedulerPreemption also fails intermittently -
there is already an open jira for it: YARN-9333.

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch,
> YARN-9339.003.patch, YARN-9339.004.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
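The bug class behind YARN-9339 is a metric that is incremented for the target queue on a move without the matching decrement on the source queue. A self-contained sketch of keeping `appsPending` consistent across a queue move; the class is illustrative, not the CapacityScheduler's actual `QueueMetrics` API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of keeping a per-queue appsPending count consistent across a queue
// move; names are illustrative, not the actual YARN-9339 patch.
public class PendingAppsMetrics {
    private final Map<String, Integer> appsPending = new HashMap<>();

    public void submitApp(String queue) {
        appsPending.merge(queue, 1, Integer::sum);
    }

    // Decrement the source queue and increment the target. Forgetting the
    // decrement is exactly what leaves the cluster-wide count inflated,
    // as observed in TestCapacityScheduler#testMoveAppBasic.
    public void moveApp(String from, String to) {
        appsPending.merge(from, -1, Integer::sum);
        appsPending.merge(to, 1, Integer::sum);
    }

    public int getAppsPending(String queue) {
        return appsPending.getOrDefault(queue, 0);
    }
}
```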
[jira] [Updated] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9339:
    Attachment: YARN-9339.004.patch

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch,
> YARN-9339.003.patch, YARN-9339.004.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821690#comment-16821690 ]

Abhishek Modi commented on YARN-9339:

Thanks [~elgoiri] for the review. Attached v4 patch with fixes.

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch,
> YARN-9339.003.patch, YARN-9339.004.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
[jira] [Commented] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820823#comment-16820823 ]

Abhishek Modi commented on YARN-2889:

Thanks [~elgoiri] for the review:
* Let's avoid using "luser" - this is used across all the tests. Should I change
them all to something else?
* There are already tests covering a large number of allocations:
testLotsOfContainersRackLocalAllocationSameSchedKey and
testLotsOfContainersRackLocalAllocation already cover those cases.

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
[jira] [Commented] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
[ https://issues.apache.org/jira/browse/YARN-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820808#comment-16820808 ]

Abhishek Modi commented on YARN-9448:

Thanks [~elgoiri] for the review. Attached v4 patch with the fix.

> Fix Opportunistic Scheduling for node local allocations.
> --------------------------------------------------------
>
> Key: YARN-9448
> URL: https://issues.apache.org/jira/browse/YARN-9448
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9448.001.patch, YARN-9448.002.patch,
> YARN-9448.003.patch, YARN-9448.004.patch
>
> Right now, an opportunistic container might not get allocated on a rack-local
> node even if one is available. Nodes are currently blacklisted if any
> container other than a node-local container is allocated on them, so if a
> container was previously allocated on a node, that node is not considered
> again even when there is an ask for a node-local request.
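The scheduling rule the description implies can be reduced to: a blacklisted node must still be considered when the ask is node-local for that exact node. A standalone sketch of that rule, with hypothetical names rather than the actual opportunistic allocator code:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the YARN-9448 idea: a node-local ask for a specific node should
// override that node's blacklisting. Names are illustrative, not the actual
// OpportunisticContainerAllocator implementation.
public class NodeCandidateFilter {
    private final Set<String> blacklist = new HashSet<>();

    public void blacklist(String node) {
        blacklist.add(node);
    }

    // A node is usable if it is not blacklisted, or if the request explicitly
    // asks for that node (the node-local case the bug dropped).
    public boolean isUsable(String node, String requestedNode) {
        return !blacklist.contains(node) || node.equals(requestedNode);
    }
}
```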
[jira] [Updated] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
[ https://issues.apache.org/jira/browse/YARN-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9448:
    Attachment: YARN-9448.004.patch

> Fix Opportunistic Scheduling for node local allocations.
> --------------------------------------------------------
>
> Key: YARN-9448
> URL: https://issues.apache.org/jira/browse/YARN-9448
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9448.001.patch, YARN-9448.002.patch,
> YARN-9448.003.patch, YARN-9448.004.patch
>
> Right now, an opportunistic container might not get allocated on a rack-local
> node even if one is available. Nodes are currently blacklisted if any
> container other than a node-local container is allocated on them, so if a
> container was previously allocated on a node, that node is not considered
> again even when there is an ask for a node-local request.
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820331#comment-16820331 ]

Abhishek Modi commented on YARN-9339:

None of the failures are related to this patch.

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch,
> YARN-9339.003.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
[jira] [Updated] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9339:
    Attachment: YARN-9339.003.patch

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch,
> YARN-9339.003.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
[jira] [Commented] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
[ https://issues.apache.org/jira/browse/YARN-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818264#comment-16818264 ]

Abhishek Modi commented on YARN-9448:

Thanks [~elgoiri] for the review. Attached v3 patch with more detailed comments
in the test.

> Fix Opportunistic Scheduling for node local allocations.
> --------------------------------------------------------
>
> Key: YARN-9448
> URL: https://issues.apache.org/jira/browse/YARN-9448
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9448.001.patch, YARN-9448.002.patch,
> YARN-9448.003.patch
>
> Right now, an opportunistic container might not get allocated on a rack-local
> node even if one is available. Nodes are currently blacklisted if any
> container other than a node-local container is allocated on them, so if a
> container was previously allocated on a node, that node is not considered
> again even when there is an ask for a node-local request.
[jira] [Updated] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
[ https://issues.apache.org/jira/browse/YARN-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9448:
    Attachment: YARN-9448.003.patch

> Fix Opportunistic Scheduling for node local allocations.
> --------------------------------------------------------
>
> Key: YARN-9448
> URL: https://issues.apache.org/jira/browse/YARN-9448
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9448.001.patch, YARN-9448.002.patch,
> YARN-9448.003.patch
>
> Right now, an opportunistic container might not get allocated on a rack-local
> node even if one is available. Nodes are currently blacklisted if any
> container other than a node-local container is allocated on them, so if a
> container was previously allocated on a node, that node is not considered
> again even when there is an ask for a node-local request.
[jira] [Commented] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818212#comment-16818212 ]

Abhishek Modi commented on YARN-2889:

Thanks [~elgoiri] for the review. Attached v3 patch addressing the comments.

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
[jira] [Updated] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-2889:
    Attachment: YARN-2889.003.patch

> Limit the number of opportunistic container allocated per AM heartbeat
> ----------------------------------------------------------------------
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-2889.001.patch, YARN-2889.002.patch,
> YARN-2889.003.patch
>
> We introduce a way to limit the number of opportunistic containers that will
> be allocated on each AM heartbeat. This way we can restrict the number of
> opportunistic containers handed out by the system, as well as throttle down
> misbehaving AMs (asking for too many opportunistic containers).
[jira] [Commented] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817517#comment-16817517 ]

Abhishek Modi commented on YARN-9474:

Thanks [~elgoiri] for review and committing it.

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-9474.001.patch, YARN-9474.002.patch
>
> Remove hard coded sleep from Opportunistic Scheduler tests and improve logs.
[jira] [Commented] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817176#comment-16817176 ]

Abhishek Modi commented on YARN-9474:

The null assert on rmContainer is not required since we already null-check it
above and assign it to a new variable. That assert was redundant, so I removed it.

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9474.001.patch, YARN-9474.002.patch
>
> Remove hard coded sleep from Opportunistic Scheduler tests and improve logs.
[jira] [Commented] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816829#comment-16816829 ]

Abhishek Modi commented on YARN-9474:

Test failure and findbugs warnings are not related to this patch.

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9474.001.patch, YARN-9474.002.patch
[jira] [Commented] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816575#comment-16816575 ]

Abhishek Modi commented on YARN-9474:

Thanks [~elgoiri] for reviewing it. I have addressed the review comments in the
v2 patch.

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9474.001.patch, YARN-9474.002.patch
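The usual replacement for a hard-coded `Thread.sleep` in tests like these is polling a condition until it becomes true or a timeout elapses; Hadoop's own `GenericTestUtils.waitFor` provides this. A standalone equivalent, to show the pattern without Hadoop on the classpath:

```java
import java.util.function.BooleanSupplier;

// Standalone poll-until-true helper, the pattern that replaces hard-coded
// sleeps in tests (Hadoop itself offers GenericTestUtils.waitFor for this).
public class WaitUtil {
    // Poll 'check' every 'intervalMs' until it returns true or 'timeoutMs'
    // elapses; returns whether the condition was met in time.
    public static boolean waitFor(BooleanSupplier check, long intervalMs,
                                  long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return check.getAsBoolean();
    }
}
```

Unlike a fixed sleep, the test proceeds as soon as the condition holds, so the common case is fast and only the failing case waits out the full timeout.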
[jira] [Updated] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9474:
    Attachment: YARN-9474.002.patch

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9474.001.patch, YARN-9474.002.patch
[jira] [Commented] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
[ https://issues.apache.org/jira/browse/YARN-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816288#comment-16816288 ]

Abhishek Modi commented on YARN-9474:

The findbugs warning is unrelated to this change. [~elgoiri] [~giovanni.fumarola]
could you please review this? Thanks.

> Remove hard coded sleep from Opportunistic Scheduler tests.
> -----------------------------------------------------------
>
> Key: YARN-9474
> URL: https://issues.apache.org/jira/browse/YARN-9474
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9474.001.patch
[jira] [Created] (YARN-9474) Remove hard coded sleep from Opportunistic Scheduler tests.
Abhishek Modi created YARN-9474:
    Summary: Remove hard coded sleep from Opportunistic Scheduler tests.
    Key: YARN-9474
    URL: https://issues.apache.org/jira/browse/YARN-9474
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Abhishek Modi
    Assignee: Abhishek Modi
[jira] [Commented] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816109#comment-16816109 ]

Abhishek Modi commented on YARN-9435:

Thanks [~giovanni.fumarola] for review and committing it to trunk.

> Add Opportunistic Scheduler metrics in ResourceManager.
> -------------------------------------------------------
>
> Key: YARN-9435
> URL: https://issues.apache.org/jira/browse/YARN-9435
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-9435.001.patch, YARN-9435.002.patch,
> YARN-9435.003.patch, YARN-9435.004.patch
>
> Right now there are no metrics available for Opportunistic Scheduler at
> ResourceManager. As part of this jira, we will add metrics like number of
> allocated opportunistic containers, released opportunistic containers, node
> level allocations, rack level allocations etc. for Opportunistic Scheduler.
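The counters the YARN-9435 description lists (allocated and released opportunistic containers, node-level vs rack-level allocations) could be kept along these lines. Field and method names are illustrative, not the actual `OpportunisticSchedulerMetrics` class, which would be built on Hadoop's metrics2 annotations:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the opportunistic-scheduler counters described in YARN-9435;
// names are illustrative, not the actual RM metrics class.
public class OppSchedulerCounters {
    private final AtomicLong allocated = new AtomicLong();
    private final AtomicLong released = new AtomicLong();
    private final AtomicLong nodeLocal = new AtomicLong();
    private final AtomicLong rackLocal = new AtomicLong();

    // One opportunistic container allocated, classified by locality.
    public void onAllocated(boolean isNodeLocal) {
        allocated.incrementAndGet();
        if (isNodeLocal) {
            nodeLocal.incrementAndGet();
        } else {
            rackLocal.incrementAndGet();
        }
    }

    public void onReleased() {
        released.incrementAndGet();
    }

    // Containers currently outstanding: allocated minus released.
    public long outstanding() {
        return allocated.get() - released.get();
    }
}
```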
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815646#comment-16815646 ]

Abhishek Modi commented on YARN-9339:

Thanks [~elgoiri] for the review. I will address the comments in the next patch.

> Apps pending metric incorrect after moving app to a new queue
> -------------------------------------------------------------
>
> Key: YARN-9339
> URL: https://issues.apache.org/jira/browse/YARN-9339
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Abhishek Modi
> Priority: Minor
> Attachments: YARN-9339.001.patch, YARN-9339.002.patch
>
> I observed a cluster that had a high Apps Pending count that appeared to be
> incorrect. This seemed to be related to apps being moved to different queues.
> I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic
> before and after a moveApplication call. Before the call appsPending was 1
> and afterwards appsPending was 2.
[jira] [Commented] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815335#comment-16815335 ]

Abhishek Modi commented on YARN-9435:

None of the findbugs warnings is related to the patch.

> Add Opportunistic Scheduler metrics in ResourceManager.
> -------------------------------------------------------
>
> Key: YARN-9435
> URL: https://issues.apache.org/jira/browse/YARN-9435
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9435.001.patch, YARN-9435.002.patch,
> YARN-9435.003.patch, YARN-9435.004.patch
>
> Right now there are no metrics available for Opportunistic Scheduler at
> ResourceManager. As part of this jira, we will add metrics like number of
> allocated opportunistic containers, released opportunistic containers, node
> level allocations, rack level allocations etc. for Opportunistic Scheduler.
[jira] [Commented] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815133#comment-16815133 ]

Abhishek Modi commented on YARN-9435:

Thanks [~giovanni.fumarola] for reviewing this. I have attached v4 patch after
removing the sleep.

> Add Opportunistic Scheduler metrics in ResourceManager.
> -------------------------------------------------------
>
> Key: YARN-9435
> URL: https://issues.apache.org/jira/browse/YARN-9435
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9435.001.patch, YARN-9435.002.patch,
> YARN-9435.003.patch, YARN-9435.004.patch
>
> Right now there are no metrics available for Opportunistic Scheduler at
> ResourceManager. As part of this jira, we will add metrics like number of
> allocated opportunistic containers, released opportunistic containers, node
> level allocations, rack level allocations etc. for Opportunistic Scheduler.
[jira] [Updated] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi updated YARN-9435:
    Attachment: YARN-9435.004.patch

> Add Opportunistic Scheduler metrics in ResourceManager.
> -------------------------------------------------------
>
> Key: YARN-9435
> URL: https://issues.apache.org/jira/browse/YARN-9435
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9435.001.patch, YARN-9435.002.patch,
> YARN-9435.003.patch, YARN-9435.004.patch
>
> Right now there are no metrics available for Opportunistic Scheduler at
> ResourceManager. As part of this jira, we will add metrics like number of
> allocated opportunistic containers, released opportunistic containers, node
> level allocations, rack level allocations etc. for Opportunistic Scheduler.
[jira] [Updated] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
[ https://issues.apache.org/jira/browse/YARN-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9448: Attachment: YARN-9448.002.patch > Fix Opportunistic Scheduling for node local allocations. > > > Key: YARN-9448 > URL: https://issues.apache.org/jira/browse/YARN-9448 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9448.001.patch, YARN-9448.002.patch > > > Right now, an opportunistic container might not get allocated on a rack-local node > even if it is available. > Nodes are currently blacklisted if any container other than a node-local one is allocated on them. So if a container was previously allocated on a node, that node is not > considered again even when there is an ask for a node-local request. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
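The YARN-9448 description boils down to one rule: a node blacklisted by an earlier off-node allocation should still be considered when the current ask explicitly names that node. A minimal sketch of that corrected candidate check, with hypothetical names (the real scheduler works on richer request objects):

```java
import java.util.Set;

// Hypothetical sketch of the corrected candidate-node check from the
// YARN-9448 description: an explicit node-local ask overrides the
// blacklist for the asked node.
class NodeCandidateCheck {
    public static boolean isCandidate(String node, Set<String> blacklisted,
                                      String askedNode) {
        // A node-local ask for exactly this node always considers it.
        if (node.equals(askedNode)) {
            return true;
        }
        // Otherwise the earlier blacklisting still applies.
        return !blacklisted.contains(node);
    }
}
```

With this check, a previously used node remains eligible for node-local requests while staying excluded from relaxed (rack or off-switch) placement.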
[jira] [Commented] (YARN-3488) AM get timeline service info from RM rather than Application specific configuration.
[ https://issues.apache.org/jira/browse/YARN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811452#comment-16811452 ] Abhishek Modi commented on YARN-3488: - Findbugs warnings are not related to this patch. > AM get timeline service info from RM rather than Application specific > configuration. > > > Key: YARN-3488 > URL: https://issues.apache.org/jira/browse/YARN-3488 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications >Reporter: Junping Du >Assignee: Abhishek Modi >Priority: Major > Labels: YARN-5355 > Attachments: YARN-3488.001.patch, YARN-3488.002.patch, > YARN-3488.003.patch, YARN-3488.004.patch > > > Since the v1 timeline service, we have had an MR configuration to enable/disable putting > history events to the timeline service. For the ongoing v2 timeline service effort, we currently have different methods/structures between v1 and v2 for > consuming TimelineClient, so an application has to be aware of which > timeline service version is in use. > There are basically two options here: > The first option is, as currently done in DistributedShell or MR, to let the application > use a specific configuration to indicate whether ATS is enabled and which > version it is, e.g. MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA. > The other option is to let the application figure out timeline-related info > from YARN/RM; this can be done through registerApplicationMaster() in > ApplicationMasterProtocol with a return value for the service of "off", "v1_on", or > "v2_on". > We prefer the latter option because the application owner doesn't have to be aware of > RM/YARN infrastructure details. Please note that we should stay compatible > (consistent behavior with the same settings) with released configurations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
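The preferred option in YARN-3488 is for the RM to report the timeline service state ("off", "v1_on", "v2_on") in the registration response, so the AM needs no app-side configuration. A small sketch of how the RM might derive that value; the enum and method below are hypothetical, not the actual ApplicationMasterProtocol API.

```java
// Hypothetical sketch: the RM computes the timeline service state from
// its own configuration and returns it to the AM at registration time.
class TimelineServiceInfo {
    enum State { OFF, V1_ON, V2_ON }

    // Stand-in for the value the RM would derive from its config
    // (whether ATS is enabled, and which version is configured).
    public static State fromRmConfig(boolean enabled, float version) {
        if (!enabled) {
            return State.OFF;
        }
        return version >= 2.0f ? State.V2_ON : State.V1_ON;
    }
}
```

The point of the design is that only this RM-side function knows the infrastructure details; the AM just switches on the returned state.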
[jira] [Commented] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811445#comment-16811445 ] Abhishek Modi commented on YARN-9382: - Thanks [~vrushalic] for reviewing and committing it. > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Labels: atsv2 > Fix For: 3.3.0 > > Attachments: YARN-9382.001.patch, YARN-9382.002.patch, > YARN-9382.003.patch > > > There are some events missing in container lifecycle. We need to add support > for adding events for when container gets killed, paused and resumed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811444#comment-16811444 ] Abhishek Modi commented on YARN-9335: - Thanks [~vrushalic] for review and committing it. > [atsv2] Restrict the number of elements held in timeline collector when > backend is unreachable for async calls > -- > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Labels: atvs > Fix For: 3.3.0 > > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch, YARN-9335.004.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
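The YARN-9335 fix limits how many entities the timeline collector may buffer for async writes when the backend is unreachable. One way to sketch that is a bounded queue that rejects new entities once full; the class below is an illustration with hypothetical names, and the real patch may use a different drop policy.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: bound the number of timeline entities buffered
// for async publishing so an unreachable backend cannot grow NM memory
// without limit; entities past the cap are dropped and counted.
class BoundedTimelineBuffer<E> {
    private final BlockingQueue<E> pending;
    private long dropped;

    BoundedTimelineBuffer(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false (and counts a drop) when the buffer is at capacity.
    public synchronized boolean offer(E entity) {
        boolean accepted = pending.offer(entity);
        if (!accepted) {
            dropped++;
        }
        return accepted;
    }

    public synchronized int size() { return pending.size(); }
    public synchronized long getDropped() { return dropped; }
}
```

A publisher thread would drain `pending` when the backend is reachable; the `dropped` counter gives operators visibility into lost async events.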
[jira] [Commented] (YARN-3488) AM get timeline service info from RM rather than Application specific configuration.
[ https://issues.apache.org/jira/browse/YARN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811166#comment-16811166 ] Abhishek Modi commented on YARN-3488: - Attached a new patch after merging with trunk. > AM get timeline service info from RM rather than Application specific > configuration. > > > Key: YARN-3488 > URL: https://issues.apache.org/jira/browse/YARN-3488 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications >Reporter: Junping Du >Assignee: Abhishek Modi >Priority: Major > Labels: YARN-5355 > Attachments: YARN-3488.001.patch, YARN-3488.002.patch, > YARN-3488.003.patch, YARN-3488.004.patch > > > Since the v1 timeline service, we have had an MR configuration to enable/disable putting > history events to the timeline service. For the ongoing v2 timeline service effort, we currently have different methods/structures between v1 and v2 for > consuming TimelineClient, so an application has to be aware of which > timeline service version is in use. > There are basically two options here: > The first option is, as currently done in DistributedShell or MR, to let the application > use a specific configuration to indicate whether ATS is enabled and which > version it is, e.g. MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA. > The other option is to let the application figure out timeline-related info > from YARN/RM; this can be done through registerApplicationMaster() in > ApplicationMasterProtocol with a return value for the service of "off", "v1_on", or > "v2_on". > We prefer the latter option because the application owner doesn't have to be aware of > RM/YARN infrastructure details. Please note that we should stay compatible > (consistent behavior with the same settings) with released configurations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.
Abhishek Modi created YARN-9448: --- Summary: Fix Opportunistic Scheduling for node local allocations. Key: YARN-9448 URL: https://issues.apache.org/jira/browse/YARN-9448 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi Right now, an opportunistic container might not get allocated on a rack-local node even if it is available. Nodes are currently blacklisted if any container other than a node-local one is allocated on them. So if a container was previously allocated on a node, that node is not considered again even when there is an ask for a node-local request. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3488) AM get timeline service info from RM rather than Application specific configuration.
[ https://issues.apache.org/jira/browse/YARN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-3488: Attachment: YARN-3488.004.patch > AM get timeline service info from RM rather than Application specific > configuration. > > > Key: YARN-3488 > URL: https://issues.apache.org/jira/browse/YARN-3488 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications >Reporter: Junping Du >Assignee: Abhishek Modi >Priority: Major > Labels: YARN-5355 > Attachments: YARN-3488.001.patch, YARN-3488.002.patch, > YARN-3488.003.patch, YARN-3488.004.patch > > > Since the v1 timeline service, we have had an MR configuration to enable/disable putting > history events to the timeline service. For the ongoing v2 timeline service effort, we currently have different methods/structures between v1 and v2 for > consuming TimelineClient, so an application has to be aware of which > timeline service version is in use. > There are basically two options here: > The first option is, as currently done in DistributedShell or MR, to let the application > use a specific configuration to indicate whether ATS is enabled and which > version it is, e.g. MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA. > The other option is to let the application figure out timeline-related info > from YARN/RM; this can be done through registerApplicationMaster() in > ApplicationMasterProtocol with a return value for the service of "off", "v1_on", or > "v2_on". > We prefer the latter option because the application owner doesn't have to be aware of > RM/YARN infrastructure details. Please note that we should stay compatible > (consistent behavior with the same settings) with released configurations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-8647) Add a flag to disable move app between queues
[ https://issues.apache.org/jira/browse/YARN-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi resolved YARN-8647. - Resolution: Won't Fix > Add a flag to disable move app between queues > - > > Key: YARN-8647 > URL: https://issues.apache.org/jira/browse/YARN-8647 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.3 >Reporter: sarun singla >Assignee: Abhishek Modi >Priority: Critical > > For large clusters with a number of users submitting applications, we can > run into scenarios where app developers try to move their applications between queues > using something like > {code:java} > yarn application -movetoqueue -queue {code} > Today there is no way to disable this feature if one does not want > application developers to use it. > *Solution:* > We should probably add an option to disable the move-queue feature on the RM side > at the cluster level. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
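The cluster-level switch proposed in YARN-8647 (and ultimately resolved as Won't Fix) would be a simple config gate checked before the RM performs a move. A sketch of that idea; the property key below is invented for illustration and is not a real YARN configuration key.

```java
import java.util.Map;

// Hypothetical sketch of a cluster-level gate for moving apps between
// queues; "yarn.resourcemanager.move-app-enabled" is an invented key,
// not an actual YARN property.
class MoveAppGate {
    static final String MOVE_ENABLED_KEY =
        "yarn.resourcemanager.move-app-enabled";

    public static void checkMoveAllowed(Map<String, String> conf) {
        // Default to enabled so existing cluster behavior is preserved.
        boolean enabled =
            Boolean.parseBoolean(conf.getOrDefault(MOVE_ENABLED_KEY, "true"));
        if (!enabled) {
            throw new UnsupportedOperationException(
                "Moving applications between queues is disabled on this cluster");
        }
    }
}
```

The RM would call such a check at the start of its move-application handler, so a `yarn application -movetoqueue` request fails fast with a clear message when the feature is switched off.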
[jira] [Commented] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810524#comment-16810524 ] Abhishek Modi commented on YARN-9382: - None of the findbugs warning are related to this patch. > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch, > YARN-9382.003.patch > > > There are some events missing in container lifecycle. We need to add support > for adding events for when container gets killed, paused and resumed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810036#comment-16810036 ] Abhishek Modi commented on YARN-9335: - Thanks [~vrushalic]. There were some conflicts due to some changes in trunk. Attached a new patch after resolving conflicts. Thanks. > [atsv2] Restrict the number of elements held in timeline collector when > backend is unreachable for async calls > -- > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch, YARN-9335.004.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Attachment: YARN-9335.004.patch > [atsv2] Restrict the number of elements held in timeline collector when > backend is unreachable for async calls > -- > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch, YARN-9335.004.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809934#comment-16809934 ] Abhishek Modi commented on YARN-9382: - [~vrushalic] I looked into it; due to some recent changes in trunk there was a conflict. I attached a new patch after resolving the conflicts. Thanks. > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch, > YARN-9382.003.patch > > > There are some events missing in container lifecycle. We need to add support > for adding events for when container gets killed, paused and resumed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9382: Attachment: YARN-9382.003.patch > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch, > YARN-9382.003.patch > > > There are some events missing in container lifecycle. We need to add support > for adding events for when container gets killed, paused and resumed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809539#comment-16809539 ] Abhishek Modi commented on YARN-9335: - Thanks [~vrushalic]. I will check at my end. Let me also run complete UTs with patch as I am afraid it can cause some other UT failures as we have made writes async. > [atsv2] Restrict the number of elements held in timeline collector when > backend is unreachable for async calls > -- > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809536#comment-16809536 ] Abhishek Modi commented on YARN-9382: - Thanks Vrushali - let me check at my end. > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch > > > There are some events missing in container lifecycle. We need to add support > for adding events for when container gets killed, paused and resumed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9435: Attachment: YARN-9435.003.patch > Add Opportunistic Scheduler metrics in ResourceManager. > --- > > Key: YARN-9435 > URL: https://issues.apache.org/jira/browse/YARN-9435 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9435.001.patch, YARN-9435.002.patch, > YARN-9435.003.patch > > > Right now there are no metrics available for Opportunistic Scheduler at > ResourceManager. As part of this jira, we will add metrics like number of > allocated opportunistic containers, released opportunistic containers, node > level allocations, rack level allocations etc. for Opportunistic Scheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
[ https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9435: Attachment: YARN-9435.002.patch > Add Opportunistic Scheduler metrics in ResourceManager. > --- > > Key: YARN-9435 > URL: https://issues.apache.org/jira/browse/YARN-9435 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9435.001.patch, YARN-9435.002.patch > > > Right now there are no metrics available for Opportunistic Scheduler at > ResourceManager. As part of this jira, we will add metrics like number of > allocated opportunistic containers, released opportunistic containers, node > level allocations, rack level allocations etc. for Opportunistic Scheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.
Abhishek Modi created YARN-9435: --- Summary: Add Opportunistic Scheduler metrics in ResourceManager. Key: YARN-9435 URL: https://issues.apache.org/jira/browse/YARN-9435 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi Right now there are no metrics available for Opportunistic Scheduler at ResourceManager. As part of this jira, we will add metrics like number of allocated opportunistic containers, released opportunistic containers, node level allocations, rack level allocations etc. for Opportunistic Scheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9428) Add metrics for paused containers in NodeManager
[ https://issues.apache.org/jira/browse/YARN-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807362#comment-16807362 ] Abhishek Modi commented on YARN-9428: - Thanks [~giovanni.fumarola] for review and committing it. Thanks. > Add metrics for paused containers in NodeManager > > > Key: YARN-9428 > URL: https://issues.apache.org/jira/browse/YARN-9428 > Project: Hadoop YARN > Issue Type: Task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9428.001.patch, YARN-9428.002.patch > > > Add metrics for paused containers in NodeManager. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807363#comment-16807363 ] Abhishek Modi commented on YARN-2889: - [~giovanni.fumarola] could you please review it. Thanks. > Limit the number of opportunistic container allocated per AM heartbeat > -- > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-2889.001.patch, YARN-2889.002.patch > > > We introduce a way to limit the number of opportunistic containers that will > be allocated on each AM heartbeat. > This way we can restrict the number of opportunistic containers handed out > by the system, as well as throttle down misbehaving AMs (asking for too many > opportunistic containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-2889: Description: We introduce a way to limit the number of opportunistic containers that will be allocated on each AM heartbeat. This way we can restrict the number of opportunistic containers handed out by the system, as well as throttle down misbehaving AMs (asking for too many opportunistic containers). was: We introduce a way to limit the number of queueable requests that each AM can submit to the LocalRM. This way we can restrict the number of queueable containers handed out by the system, as well as throttle down misbehaving AMs (asking for too many queueable containers). > Limit the number of opportunistic container allocated per AM heartbeat > -- > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-2889.001.patch, YARN-2889.002.patch > > > We introduce a way to limit the number of opportunistic containers that will > be allocated on each AM heartbeat. > This way we can restrict the number of opportunistic containers handed out > by the system, as well as throttle down misbehaving AMs (asking for too many > opportunistic containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
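The throttle described in the YARN-2889 summary amounts to capping the number of opportunistic containers granted in a single AM heartbeat, regardless of how many were requested. A minimal sketch of that cap, with hypothetical names (the real patch works on container request objects, not bare counts):

```java
// Hypothetical sketch: cap how many opportunistic containers one AM can
// receive per heartbeat, which both bounds system-wide opportunistic
// allocations and throttles misbehaving AMs.
class OpportunisticAllocationCap {
    public static int containersToAllocate(int requested, int maxPerHeartbeat) {
        if (requested < 0 || maxPerHeartbeat < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Grant at most the configured per-heartbeat maximum.
        return Math.min(requested, maxPerHeartbeat);
    }
}
```

Any remainder of the ask would simply be retried on subsequent heartbeats, so well-behaved AMs still converge on their full allocation.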
[jira] [Updated] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-2889: Attachment: YARN-2889.002.patch > Limit the number of opportunistic container allocated per AM heartbeat > -- > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-2889.001.patch, YARN-2889.002.patch > > > We introduce a way to limit the number of queueable requests that each AM can > submit to the LocalRM. > This way we can restrict the number of queueable containers handed out by the > system, as well as throttle down misbehaving AMs (asking for too many > queueable containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9428) Add metrics for paused containers in NodeManager
[ https://issues.apache.org/jira/browse/YARN-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807082#comment-16807082 ] Abhishek Modi commented on YARN-9428: - Thanks [~giovanni.fumarola] for review. Attached 002 patch with the fixes. Thanks. > Add metrics for paused containers in NodeManager > > > Key: YARN-9428 > URL: https://issues.apache.org/jira/browse/YARN-9428 > Project: Hadoop YARN > Issue Type: Task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9428.001.patch, YARN-9428.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9428) Add metrics for paused containers in NodeManager
[ https://issues.apache.org/jira/browse/YARN-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9428: Attachment: YARN-9428.002.patch > Add metrics for paused containers in NodeManager > > > Key: YARN-9428 > URL: https://issues.apache.org/jira/browse/YARN-9428 > Project: Hadoop YARN > Issue Type: Task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9428.001.patch, YARN-9428.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi reassigned YARN-2889: --- Assignee: Abhishek Modi (was: Arun Suresh) > Limit the number of opportunistic container allocated per AM heartbeat > -- > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Abhishek Modi >Priority: Major > > We introduce a way to limit the number of queueable requests that each AM can > submit to the LocalRM. > This way we can restrict the number of queueable containers handed out by the > system, as well as throttle down misbehaving AMs (asking for too many > queueable containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2889) Limit in the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-2889: Summary: Limit in the number of opportunistic container allocated per AM heartbeat (was: Limit in the number of opportunistic container requests per AM) > Limit in the number of opportunistic container allocated per AM heartbeat > - > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh >Priority: Major > > We introduce a way to limit the number of queueable requests that each AM can > submit to the LocalRM. > This way we can restrict the number of queueable containers handed out by the > system, as well as throttle down misbehaving AMs (asking for too many > queueable containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2889) Limit the number of opportunistic container allocated per AM heartbeat
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-2889: Summary: Limit the number of opportunistic container allocated per AM heartbeat (was: Limit in the number of opportunistic container allocated per AM heartbeat) > Limit the number of opportunistic container allocated per AM heartbeat > -- > > Key: YARN-2889 > URL: https://issues.apache.org/jira/browse/YARN-2889 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh >Priority: Major > > We introduce a way to limit the number of queueable requests that each AM can > submit to the LocalRM. > This way we can restrict the number of queueable containers handed out by the > system, as well as throttle down misbehaving AMs (asking for too many > queueable containers). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9428) Add metrics for paused containers in NodeManager
[ https://issues.apache.org/jira/browse/YARN-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806188#comment-16806188 ] Abhishek Modi commented on YARN-9428: - [~giovanni.fumarola] could you please review it? Thanks. > Add metrics for paused containers in NodeManager > > > Key: YARN-9428 > URL: https://issues.apache.org/jira/browse/YARN-9428 > Project: Hadoop YARN > Issue Type: Task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9428.001.patch
[jira] [Updated] (YARN-9428) Add metrics for paused containers in NodeManager
[ https://issues.apache.org/jira/browse/YARN-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9428: Attachment: YARN-9428.001.patch > Add metrics for paused containers in NodeManager > > > Key: YARN-9428 > URL: https://issues.apache.org/jira/browse/YARN-9428 > Project: Hadoop YARN > Issue Type: Task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9428.001.patch
[jira] [Created] (YARN-9428) Add metrics for paused containers in NodeManager
Abhishek Modi created YARN-9428: --- Summary: Add metrics for paused containers in NodeManager Key: YARN-9428 URL: https://issues.apache.org/jira/browse/YARN-9428 Project: Hadoop YARN Issue Type: Task Reporter: Abhishek Modi Assignee: Abhishek Modi
[jira] [Commented] (YARN-9366) Make logs in TimelineClient implementation specific to application
[ https://issues.apache.org/jira/browse/YARN-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802875#comment-16802875 ] Abhishek Modi commented on YARN-9366: - Thanks for the patch [~prabham]. In the YarnException, you are passing all the timeline entities that were not published. The toString function of TimelineEntities uses the toString function of TimelineEntity, but that is not overridden. You need to use dumptimelineRecordsToJson to convert each TimelineEntity into a readable format. I also think you should log these entities at debug level only. > Make logs in TimelineClient implementation specific to application > --- > > Key: YARN-9366 > URL: https://issues.apache.org/jira/browse/YARN-9366 > Project: Hadoop YARN > Issue Type: Improvement > Components: ATSv2 >Reporter: Prabha Manepalli >Assignee: Prabha Manepalli >Priority: Minor > Attachments: YARN-9366.v1.patch > > > For every container launched on an NM node, a timeline client is created to > publish entities to the corresponding application's timeline collector, so > there can be multiple timeline clients running at the same time. The current > timeline client logs are insufficient to isolate publishing > problems related to one application. Hence, creating this Jira to improve > the logs in TimelineV2ClientImpl.
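The review comment above is about a general pitfall: a class without an overridden toString logs as an opaque object reference, so dumping the entity to JSON is what makes the log line useful. A small sketch of the idea, with a stand-in `TimelineEntity` class (not the real YARN type) and a hypothetical dump helper:

```python
import json

# Stand-in for the real TimelineEntity; only illustrates the logging point.
class TimelineEntity:
    def __init__(self, entity_id, entity_type, info):
        self.entity_id = entity_id
        self.entity_type = entity_type
        self.info = info

def dump_entity_to_json(entity):
    """Serialize the entity's fields so log lines are human-readable,
    instead of the default object repr like <TimelineEntity at 0x...>."""
    return json.dumps({
        "id": entity.entity_id,
        "type": entity.entity_type,
        "info": entity.info,
    }, sort_keys=True)

entity = TimelineEntity("container_e01_0007", "YARN_CONTAINER", {"state": "RUNNING"})
# The JSON form identifies exactly which unpublished entity failed.
print(dump_entity_to_json(entity))
```

The same reasoning is why the comment suggests the Hadoop-side JSON dump utility over the inherited toString when reporting unpublished entities.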
[jira] [Commented] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800150#comment-16800150 ] Abhishek Modi commented on YARN-9382: - Thanks [~vrushalic] for the review. I have attached an updated patch with fixes. Thanks. > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch > > > There are some events missing in the container lifecycle. We need to add support > for events for when a container gets killed, paused or resumed.
[jira] [Updated] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
[ https://issues.apache.org/jira/browse/YARN-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9382: Attachment: YARN-9382.002.patch > Publish container killed, paused and resumed events to ATSv2. > - > > Key: YARN-9382 > URL: https://issues.apache.org/jira/browse/YARN-9382 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9382.001.patch, YARN-9382.002.patch > > > There are some events missing in the container lifecycle. We need to add support > for events for when a container gets killed, paused or resumed.
[jira] [Resolved] (YARN-8636) TimelineSchemaCreator command not working
[ https://issues.apache.org/jira/browse/YARN-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi resolved YARN-8636. - Resolution: Duplicate > TimelineSchemaCreator command not working > - > > Key: YARN-8636 > URL: https://issues.apache.org/jira/browse/YARN-8636 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Abhishek Modi >Priority: Minor > > {code:java} > nodemanager/bin> ./hadoop > org.apache.yarn.timelineservice.storage.TimelineSchemaCreator -create > Error: Could not find or load main class > org.apache.yarn.timelineservice.storage.TimelineSchemaCreator > {code} > share/hadoop/yarn/timelineservice/ is not part of the classpath
[jira] [Commented] (YARN-9342) Moving log4j1 to log4j2 in hadoop-yarn
[ https://issues.apache.org/jira/browse/YARN-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800066#comment-16800066 ] Abhishek Modi commented on YARN-9342: - At Microsoft we have switched to Log4j2 and it has provided significant improvements in performance. > Moving log4j1 to log4j2 in hadoop-yarn > -- > > Key: YARN-9342 > URL: https://issues.apache.org/jira/browse/YARN-9342 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > 1. Log4j2 asynchronous logging will give a significant improvement in > performance. > 2. Log4j2 does not have the locking issue below, which Log4j1 has. > {code} > "Thread-16" #40 daemon prio=5 os_prio=0 tid=0x7f181f9bb800 nid=0x125 > waiting for monitor entry [0x7ef163bab000] >java.lang.Thread.State: BLOCKED (on object monitor) > at org.apache.log4j.Category.callAppenders(Category.java:204) > - locked <0x7ef2d803e2b8> (a org.apache.log4j.spi.RootLogger) > at org.apache.log4j.Category.forcedLog(Category.java:391) > at org.apache.log4j.Category.log(Category.java:856) > at > org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176) > {code} > https://bz.apache.org/bugzilla/show_bug.cgi?id=57714
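The thread dump in YARN-9342 shows application threads blocked on the logger monitor while an appender runs. The asynchronous design that Log4j2 offers decouples the two: callers only enqueue the event, and a single background thread performs the slow appender I/O. A language-agnostic sketch of that structure (not Log4j2 itself, which uses a lock-free disruptor rather than a plain queue):

```python
import queue
import threading

# Events go into a queue; one consumer thread does the "slow" appender work.
log_queue: "queue.Queue" = queue.Queue()
lines_written = []

def appender_loop():
    """Drain the queue; stands in for slow file/socket appender I/O."""
    while True:
        event = log_queue.get()
        if event is None:  # shutdown sentinel
            break
        lines_written.append(event)

writer = threading.Thread(target=appender_loop)
writer.start()

# Application threads pay only the cost of an enqueue, never the I/O,
# so no caller blocks on a shared appender lock.
for i in range(3):
    log_queue.put(f"event {i}")

log_queue.put(None)
writer.join()
print(lines_written)
```

In Log4j1, by contrast, every `Category.callAppenders` call synchronizes on the logger hierarchy, which is exactly the `BLOCKED (on object monitor)` state in the dump above.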
[jira] [Commented] (YARN-9402) Opportunistic containers should not be scheduled on Decommissioning nodes.
[ https://issues.apache.org/jira/browse/YARN-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798355#comment-16798355 ] Abhishek Modi commented on YARN-9402: - Thanks [~giovanni.fumarola] for reviewing and committing it. > Opportunistic containers should not be scheduled on Decommissioning nodes. > -- > > Key: YARN-9402 > URL: https://issues.apache.org/jira/browse/YARN-9402 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9402.001.patch > > > Right now, opportunistic containers can get scheduled on Decommissioning > nodes that are being drained, which can lead to those containers being killed > when the node is decommissioned. As part of this jira, we will skip allocation of > opportunistic containers on Decommissioning nodes.
[jira] [Commented] (YARN-9402) Opportunistic containers should not be scheduled on Decommissioning nodes.
[ https://issues.apache.org/jira/browse/YARN-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797782#comment-16797782 ] Abhishek Modi commented on YARN-9402: - The test failure is unrelated. [~giovanni.fumarola] could you please review it? Thanks. > Opportunistic containers should not be scheduled on Decommissioning nodes. > -- > > Key: YARN-9402 > URL: https://issues.apache.org/jira/browse/YARN-9402 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9402.001.patch > > > Right now, opportunistic containers can get scheduled on Decommissioning > nodes that are being drained, which can lead to those containers being killed > when the node is decommissioned. As part of this jira, we will skip allocation of > opportunistic containers on Decommissioning nodes.
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797401#comment-16797401 ] Abhishek Modi commented on YARN-9339: - [~giovanni.fumarola] could you please review it? Thanks. > Apps pending metric incorrect after moving app to a new queue > - > > Key: YARN-9339 > URL: https://issues.apache.org/jira/browse/YARN-9339 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Abhishek Modi >Priority: Minor > Attachments: YARN-9339.001.patch, YARN-9339.002.patch > > > I observed a cluster that had a high Apps Pending count that appeared to be > incorrect. This seemed to be related to apps being moved to different queues. > I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic > before and after a moveApplication call. Before the call appsPending was 1 > and afterwards appsPending was 2.
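The YARN-9339 symptom (appsPending going from 1 to 2 across a moveApplication call) is the classic metric-move bug: the count is added to the target queue without being removed from the source. A toy sketch of the correct bookkeeping, with plain dicts standing in for the real QueueMetrics objects (not the actual scheduler code):

```python
# Illustrative sketch: moving an app between queues must decrement the
# source queue's counter and increment the target's, so the cluster-wide
# aggregate stays constant.

def move_app(metrics, app_state, src, dst):
    key = "apps_pending" if app_state == "PENDING" else "apps_running"
    metrics[src][key] -= 1   # the step the buggy path effectively skipped
    metrics[dst][key] += 1

metrics = {"a": {"apps_pending": 1, "apps_running": 0},
           "b": {"apps_pending": 0, "apps_running": 0}}
move_app(metrics, "PENDING", "a", "b")

total_pending = sum(q["apps_pending"] for q in metrics.values())
# Total stays 1; the reported bug left the total at 2 after the move.
print(total_pending)
```

Checking the aggregate before and after the move, as the reporter did with extra logging around testMoveAppBasic, is the quickest way to expose this class of drift.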
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Summary: [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls (was: [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls) > [atsv2] Restrict the number of elements held in timeline collector when > backend is unreachable for async calls > -- > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable.
[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797107#comment-16797107 ] Abhishek Modi commented on YARN-9339: - The checkstyle warning is due to a redundant public modifier, but that pattern is used across the file, so nothing much can be done. > Apps pending metric incorrect after moving app to a new queue > - > > Key: YARN-9339 > URL: https://issues.apache.org/jira/browse/YARN-9339 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Abhishek Modi >Priority: Minor > Attachments: YARN-9339.001.patch, YARN-9339.002.patch > > > I observed a cluster that had a high Apps Pending count that appeared to be > incorrect. This seemed to be related to apps being moved to different queues. > I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic > before and after a moveApplication call. Before the call appsPending was 1 > and afterwards appsPending was 2.
[jira] [Created] (YARN-9402) Opportunistic containers should not be scheduled on Decommissioning nodes.
Abhishek Modi created YARN-9402: --- Summary: Opportunistic containers should not be scheduled on Decommissioning nodes. Key: YARN-9402 URL: https://issues.apache.org/jira/browse/YARN-9402 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi Right now, opportunistic containers can get scheduled on Decommissioning nodes that are being drained, which can lead to those containers being killed when the node is decommissioned. As part of this jira, we will skip allocation of opportunistic containers on Decommissioning nodes.
[jira] [Updated] (YARN-9339) Apps pending metric incorrect after moving app to a new queue
[ https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9339: Attachment: YARN-9339.002.patch > Apps pending metric incorrect after moving app to a new queue > - > > Key: YARN-9339 > URL: https://issues.apache.org/jira/browse/YARN-9339 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Abhishek Modi >Priority: Minor > Attachments: YARN-9339.001.patch, YARN-9339.002.patch > > > I observed a cluster that had a high Apps Pending count that appeared to be > incorrect. This seemed to be related to apps being moved to different queues. > I tested by adding some logging to TestCapacityScheduler#testMoveAppBasic > before and after a moveApplication call. Before the call appsPending was 1 > and afterwards appsPending was 2.
[jira] [Commented] (YARN-9392) Handle missing scheduler events in Opportunistic Scheduler.
[ https://issues.apache.org/jira/browse/YARN-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796705#comment-16796705 ] Abhishek Modi commented on YARN-9392: - Thanks [~giovanni.fumarola] for reviewing and committing it. > Handle missing scheduler events in Opportunistic Scheduler. > --- > > Key: YARN-9392 > URL: https://issues.apache.org/jira/browse/YARN-9392 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9392.001.patch > > > At present, newly added scheduler events are not being ignored by the > Opportunistic scheduler, causing error messages in the logs.
[jira] [Commented] (YARN-9390) Add support for configurable Resource Calculator in Opportunistic Scheduler.
[ https://issues.apache.org/jira/browse/YARN-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796704#comment-16796704 ] Abhishek Modi commented on YARN-9390: - Thanks [~giovanni.fumarola] for the review. Right now, in the capacity scheduler there is a configuration to change the resource calculator, but there is no way to change it in opportunistic scheduling. DefaultResourceCalculator is more performant than DominantResourceCalculator, as it doesn't compare other resource types. The idea is to give users running the capacity scheduler with DefaultResourceCalculator the flexibility and option to use opportunistic scheduling with the same resource calculator. > Add support for configurable Resource Calculator in Opportunistic Scheduler. > > > Key: YARN-9390 > URL: https://issues.apache.org/jira/browse/YARN-9390 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9390.001.patch > > > Right now, the Opportunistic scheduler uses a hard-coded DominantResourceCalculator > and there is no option to change it to other resource calculators. This Jira > is to make the resource calculator configurable for the Opportunistic scheduler.
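The performance comment above rests on what the two calculators actually compute. A simplified sketch of the contrast (these functions are illustrative stand-ins, not the real ResourceCalculator classes): the default calculator sizes a request by memory alone, while the dominant-resource calculator takes the request's largest share across all resource types, which costs an extra pass over every resource.

```python
# Illustrative sketch of the two YARN resource-calculator strategies.

def default_fitness(resource, cluster):
    """DefaultResourceCalculator-style: memory share only."""
    return resource["memory"] / cluster["memory"]

def dominant_fitness(resource, cluster):
    """DominantResourceCalculator-style: the largest share across
    all resource types (memory, vcores, ...)."""
    return max(resource[r] / cluster[r] for r in resource)

cluster = {"memory": 1000, "vcores": 100}
req = {"memory": 100, "vcores": 50}

# Default sees a 10% request; dominant sees 50%, because vcores dominate.
print(default_fitness(req, cluster), dominant_fitness(req, cluster))
```

This is also why making the calculator configurable matters for consistency: a cluster whose capacity scheduler ranks by memory only would otherwise have its opportunistic allocations ranked by a different rule.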
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Attachment: YARN-9335.003.patch > [atsv2] Restrict the number of elements held in NM timeline collector when > backend is unreachable for async calls > - > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable.
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796142#comment-16796142 ] Abhishek Modi commented on YARN-9335: - Thanks [~sadineni] for pointing it out. Attached 003 patch with the fix. > [atsv2] Restrict the number of elements held in NM timeline collector when > backend is unreachable for async calls > - > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch, > YARN-9335.003.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable.
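The limit YARN-9335 asks for amounts to a bounded buffer: when the backend is unreachable, the collector should refuse (or drop) new async entities once a cap is reached instead of growing without bound and consuming NM memory. A minimal sketch of that shape, with an arbitrary capacity and a hypothetical `BoundedEntityBuffer` class (not the patched collector code):

```python
from collections import deque

class BoundedEntityBuffer:
    """Holds at most `capacity` pending entities; counts the rest as dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = 0

    def offer(self, entity):
        """Accept the entity if there is room; record a drop otherwise."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1
            return False
        self.buffer.append(entity)
        return True

buf = BoundedEntityBuffer(capacity=2)
results = [buf.offer(e) for e in ("e1", "e2", "e3")]
# One entity is rejected rather than buffered once the cap is hit.
print(results, buf.dropped)
```

Whether overflow entities are rejected, dropped oldest-first, or spilled elsewhere is a policy choice; the essential property is that the collector's memory footprint has a fixed ceiling regardless of backend availability.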
[jira] [Updated] (YARN-9392) Handle missing scheduler events in Opportunistic Scheduler.
[ https://issues.apache.org/jira/browse/YARN-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9392: Attachment: YARN-9392.001.patch > Handle missing scheduler events in Opportunistic Scheduler. > --- > > Key: YARN-9392 > URL: https://issues.apache.org/jira/browse/YARN-9392 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9392.001.patch > > > At present, newly added scheduler events are not being ignored by the > Opportunistic scheduler, causing error messages in the logs.
[jira] [Assigned] (YARN-5414) Integrate NodeQueueLoadMonitor with ClusterNodeTracker
[ https://issues.apache.org/jira/browse/YARN-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi reassigned YARN-5414: --- Assignee: Abhishek Modi (was: Arun Suresh) > Integrate NodeQueueLoadMonitor with ClusterNodeTracker > -- > > Key: YARN-5414 > URL: https://issues.apache.org/jira/browse/YARN-5414 > Project: Hadoop YARN > Issue Type: Sub-task > Components: container-queuing, distributed-scheduling, scheduler >Reporter: Arun Suresh >Assignee: Abhishek Modi >Priority: Major > > The {{ClusterNodeTracker}} tracks the states of clusterNodes and provides > convenience methods like sort and filter. > The {{NodeQueueLoadMonitor}} should use the {{ClusterNodeTracker}} instead of > maintaining its own data-structure of node information.
[jira] [Created] (YARN-9392) Handle missing scheduler events in Opportunistic Scheduler.
Abhishek Modi created YARN-9392: --- Summary: Handle missing scheduler events in Opportunistic Scheduler. Key: YARN-9392 URL: https://issues.apache.org/jira/browse/YARN-9392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi At present, newly added scheduler events are not being ignored by the Opportunistic scheduler, causing error messages in the logs.
[jira] [Commented] (YARN-9390) Add support for configurable Resource Calculator in Opportunistic Scheduler.
[ https://issues.apache.org/jira/browse/YARN-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793706#comment-16793706 ] Abhishek Modi commented on YARN-9390: - [~giovanni.fumarola] could you please review it? Thanks. > Add support for configurable Resource Calculator in Opportunistic Scheduler. > > > Key: YARN-9390 > URL: https://issues.apache.org/jira/browse/YARN-9390 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9390.001.patch > > > Right now, the Opportunistic scheduler uses a hard-coded DominantResourceCalculator > and there is no option to change it to other resource calculators. This Jira > is to make the resource calculator configurable for the Opportunistic scheduler.
[jira] [Created] (YARN-9390) Add support for configurable Resource Calculator in Opportunistic Scheduler.
Abhishek Modi created YARN-9390: --- Summary: Add support for configurable Resource Calculator in Opportunistic Scheduler. Key: YARN-9390 URL: https://issues.apache.org/jira/browse/YARN-9390 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi Right now, the Opportunistic scheduler uses a hard-coded DominantResourceCalculator and there is no option to change it to other resource calculators. This Jira is to make the resource calculator configurable for the Opportunistic scheduler.
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792921#comment-16792921 ] Abhishek Modi commented on YARN-9335: - [~vrushalic] [~rohithsharma] could you please review this patch? Thanks. > [atsv2] Restrict the number of elements held in NM timeline collector when > backend is unreachable for async calls > - > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch > > > For ATSv2 , if the backend is unreachable, the number/size of data held in > timeline collector's memory increases significantly. This is not good for the > NM memory. > Filing jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable.
[jira] [Commented] (YARN-3488) AM get timeline service info from RM rather than Application specific configuration.
[ https://issues.apache.org/jira/browse/YARN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791383#comment-16791383 ] Abhishek Modi commented on YARN-3488: - [~rohithsharma] [~vrushalic] could you please review and commit it whenever you get some time? Thanks. > AM get timeline service info from RM rather than Application specific > configuration. > > > Key: YARN-3488 > URL: https://issues.apache.org/jira/browse/YARN-3488 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications >Reporter: Junping Du >Assignee: Abhishek Modi >Priority: Major > Labels: YARN-5355 > Attachments: YARN-3488.001.patch, YARN-3488.002.patch, > YARN-3488.003.patch > > > Since the v1 timeline service, we have had an MR configuration to enable/disable putting > history events to the timeline service. For today's ongoing v2 timeline service > effort, we currently have different methods/structures between v1 and v2 for > consuming TimelineClient, so applications have to be aware of which timeline > service version is used. > There are basically two options here: > The first option is, as currently done in DistributedShell or MR, to let the application > have a specific configuration to indicate whether ATS is enabled and which > version it is, like MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA, etc. > The other option is to let the application figure out timeline-related info > from YARN/RM; this can be done through registerApplicationMaster() in > ApplicationMasterProtocol with a return value for service "off", "v1_on", or > "v2_on". > We prefer the latter option because the application owner doesn't have to be aware of > RM/YARN infrastructure details. Please note that we should keep compatibility > (consistent behavior with the same settings) with released configurations.
[jira] [Commented] (YARN-9338) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790845#comment-16790845 ] Abhishek Modi commented on YARN-9338: - Thanks [~vrushalic]. I have run all tests locally and they pass. > Timeline related testcases are failing > -- > > Key: YARN-9338 > URL: https://issues.apache.org/jira/browse/YARN-9338 > Project: Hadoop YARN > Issue Type: Test >Reporter: Prabhu Joseph >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9338.001.patch, YARN-9338.002.patch, > YARN-9338.003.patch, YARN-9338.004.patch > > > Timeline related testcases are failing > {code} > [ERROR] Failures: > [ERROR] > TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV2Enabled:262->runTest:245->validateV2:382->verifyEntity:417 > Expected 2 events to be published expected:<2> but was:<1> > [ERROR] > TestSystemMetricsPublisherForV2.testPublishAppAttemptMetrics:259->verifyEntity:332 > Expected 2 events to be published expected:<2> but was:<1> > [ERROR] > TestSystemMetricsPublisherForV2.testPublishApplicationMetrics:224->verifyEntity:332 > Expected 4 events to be published expected:<4> but was:<1> > [ERROR] > TestSystemMetricsPublisherForV2.testPublishContainerMetrics:291->verifyEntity:332 > Expected 2 events to be published expected:<2> but was:<1> > [ERROR] Errors: > [ERROR] > TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled:252->runTest:242->testSetup:123 > » YarnRuntime > [ERROR] Failures: > [ERROR] > TestTimelineAuthFilterForV2.testPutTimelineEntities:343->access$000:87->publishAndVerifyEntity:307 > expected:<1> but was:<2> > [ERROR] > TestTimelineAuthFilterForV2.testPutTimelineEntities:352->publishWithRetries:320->publishAndVerifyEntity:307 > expected:<1> but was:<2> > [ERROR] > TestTimelineAuthFilterForV2.testPutTimelineEntities:352->publishWithRetries:320->publishAndVerifyEntity:307 > expected:<1> but was:<2> > [ERROR] > TestTimelineAuthFilterForV2.testPutTimelineEntities:343->access$000:87->publishAndVerifyEntity:307 > expected:<1> but was:<2> > [INFO] > [ERROR] Failures: > [ERROR] > TestDistributedShell.testDSShellWithoutDomainV2:313->testDSShell:317->testDSShell:458->checkTimelineV2:557->verifyEntityForTimelineV2:710 > Unexpected number of DS_APP_ATTEMPT_START event published. expected:<1> but > was:<0> > [ERROR] > TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow:329->testDSShell:458->checkTimelineV2:557->verifyEntityForTimelineV2:710 > Unexpected number of DS_APP_ATTEMPT_START event published. expected:<1> but > was:<0> > [ERROR] > TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow:323->testDSShell:458->checkTimelineV2:557->verifyEntityForTimelineV2:710 > Unexpected number of DS_APP_ATTEMPT_START event published. expected:<1> but > was:<0> > [ERROR] Failures: > [ERROR] > TestMRTimelineEventHandling.testMRNewTimelineServiceEventHandling:240->checkNewTimelineEvent:304->verifyEntity:462 > {code}
[jira] [Commented] (YARN-9381) The yarn-default.xml has two identical property named yarn.timeline-service.http-cross-origin.enabled
[ https://issues.apache.org/jira/browse/YARN-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790840#comment-16790840 ] Abhishek Modi commented on YARN-9381: - [~cheersyang] [~giovanni.fumarola] could you please review this? > The yarn-default.xml has two identical property named > yarn.timeline-service.http-cross-origin.enabled > - > > Key: YARN-9381 > URL: https://issues.apache.org/jira/browse/YARN-9381 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.1.0, 3.2.0, 3.1.1, 3.1.2 >Reporter: jenny >Assignee: Abhishek Modi >Priority: Trivial > Attachments: YARN-9381.001.patch, image-2019-03-12-16-51-19-748.png > > > The yarn-default.xml file has two identical properties named > yarn.timeline-service.http-cross-origin.enabled. > !image-2019-03-12-16-51-19-748.png|width=298,height=98!
[jira] [Assigned] (YARN-9381) The yarn-default.xml has two identical property named yarn.timeline-service.http-cross-origin.enabled
[ https://issues.apache.org/jira/browse/YARN-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi reassigned YARN-9381: --- Assignee: Abhishek Modi
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Attachment: YARN-9335.002.patch > [atsv2] Restrict the number of elements held in NM timeline collector when > backend is unreachable for async calls > - > > Key: YARN-9335 > URL: https://issues.apache.org/jira/browse/YARN-9335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9335.001.patch, YARN-9335.002.patch > > > For ATSv2, if the backend is unreachable, the number/size of data held in > the timeline collector's memory increases significantly. This is not good for > NM memory. > Filing this jira to set a limit on how many/much should be retained by the > timeline collector in memory in case the backend is not reachable.
[jira] [Created] (YARN-9383) Publish federation events to ATSv2.
Abhishek Modi created YARN-9383: --- Summary: Publish federation events to ATSv2. Key: YARN-9383 URL: https://issues.apache.org/jira/browse/YARN-9383 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi With federation enabled, containers for a single application might get spawned across multiple sub-clusters. This information is currently not published to ATSv2. As part of this jira, we are going to publish federation-related info in container events to ATSv2.
[jira] [Created] (YARN-9382) Publish container killed, paused and resumed events to ATSv2.
Abhishek Modi created YARN-9382: --- Summary: Publish container killed, paused and resumed events to ATSv2. Key: YARN-9382 URL: https://issues.apache.org/jira/browse/YARN-9382 Project: Hadoop YARN Issue Type: Sub-task Reporter: Abhishek Modi Assignee: Abhishek Modi Some events are missing from the container lifecycle. We need to add events for when a container is killed, paused, or resumed.
[jira] [Updated] (YARN-9338) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9338: Attachment: YARN-9338.004.patch
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Summary: [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls (was: [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for asycn calls)
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for asycn calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788667#comment-16788667 ] Abhishek Modi commented on YARN-9335: - Sure [~Prabhu Joseph], you can take over the sync-writes part; I will attach a patch for the async one. Thanks.
[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for asycn calls
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9335: Summary: [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for asycn calls (was: [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable)
[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable
[ https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787541#comment-16787541 ] Abhishek Modi commented on YARN-9335: - There are two major issues right now. The HBase client has a very long retry timeout, which causes threads to get blocked on entity writes for async calls. For sync writes, threads get blocked at synchronized blocks, which bloats the event queue, causing heavy memory pressure on the NM as well as delays in processing other events.
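The direction discussed in this thread — capping what the NM timeline collector holds in memory when the backend is unreachable — can be sketched as a size-bounded queue that drops new entities (instead of blocking the publishing thread) once the cap is reached. This is an illustrative sketch under assumed names (BoundedEntityBuffer is hypothetical, not the actual TimelineCollector code):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch (not actual Hadoop code): a bounded buffer for
 * pending timeline entities, so an unreachable backend cannot grow NM
 * memory without limit. Entities beyond the cap are counted and dropped.
 */
public class BoundedEntityBuffer {
    private final BlockingQueue<String> pending;
    private final AtomicLong dropped = new AtomicLong();

    public BoundedEntityBuffer(int capacity) {
        // LinkedBlockingQueue enforces the capacity for us.
        this.pending = new LinkedBlockingQueue<>(capacity);
    }

    /** Non-blocking enqueue: drop rather than block the publishing thread. */
    public boolean offer(String entity) {
        if (pending.offer(entity)) {
            return true;
        }
        // Backend has been unreachable long enough to fill the cap.
        dropped.incrementAndGet();
        return false;
    }

    public long droppedCount() { return dropped.get(); }

    public int size() { return pending.size(); }

    public static void main(String[] args) {
        BoundedEntityBuffer buf = new BoundedEntityBuffer(2);
        buf.offer("entity-1");
        buf.offer("entity-2");
        boolean accepted = buf.offer("entity-3"); // over capacity: dropped
        System.out.println(accepted + " size=" + buf.size()
            + " dropped=" + buf.droppedCount());
    }
}
```

The key design choice mirrored here is using a non-blocking `offer` so that, unlike the blocked-thread behavior described in the comment above, writers never wait on the backend.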
[jira] [Commented] (YARN-8218) Add application launch time to ATSV1
[ https://issues.apache.org/jira/browse/YARN-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787006#comment-16787006 ] Abhishek Modi commented on YARN-8218: - Thanks [~vrushalic] for reviewing and committing it. > Add application launch time to ATSV1 > > > Key: YARN-8218 > URL: https://issues.apache.org/jira/browse/YARN-8218 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Kanwaljeet Sachdev >Assignee: Abhishek Modi >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-8218.001.patch > > > YARN-7088 publishes application launch time to RMStore and also adds it to > the YARN UI. It would be a nice enhancement to have the launchTime event > published into the Application history server as well.
[jira] [Commented] (YARN-8218) Add application launch time to ATSV1
[ https://issues.apache.org/jira/browse/YARN-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785751#comment-16785751 ] Abhishek Modi commented on YARN-8218: - Gentle reminder [~vrushalic]. Thanks.
[jira] [Commented] (YARN-3488) AM get timeline service info from RM rather than Application specific configuration.
[ https://issues.apache.org/jira/browse/YARN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785746#comment-16785746 ] Abhishek Modi commented on YARN-3488: - Gentle reminder [~rohithsharma] [~vrushalic] > AM get timeline service info from RM rather than Application specific > configuration. > > > Key: YARN-3488 > URL: https://issues.apache.org/jira/browse/YARN-3488 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications >Reporter: Junping Du >Assignee: Abhishek Modi >Priority: Major > Labels: YARN-5355 > Attachments: YARN-3488.001.patch, YARN-3488.002.patch, > YARN-3488.003.patch > > > Since the v1 timeline service, we have had MR configuration to enable/disable > putting history events to the timeline service. For today's ongoing v2 timeline > service effort, we currently have different methods/structures between v1 and > v2 for consuming TimelineClient, so applications have to be aware of which > timeline service version is used. > There are basically two options here: > The first option, as currently done in DistributedShell or MR, is to let the > application have a specific configuration indicating whether ATS is enabled > and which version, e.g. MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA, etc. > The other option is to let the application figure out timeline-related info > from YARN/RM; this can be done through registerApplicationMaster() in > ApplicationMasterProtocol with a return value for service "off", "v1_on", or > "v2_on". > We prefer the latter option because the application owner doesn't have to be > aware of RM/YARN infrastructure details. Please note that we should stay compatible > (consistent behavior with the same setting) with released configurations.
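The second option described in YARN-3488 — the AM learning the timeline-service level from the RM's register response instead of from application-specific configuration — can be sketched roughly as below. All names here (TimelineServiceLevel, clientFor) are hypothetical illustrations, not real YARN API; in the proposal, the level would come back from registerApplicationMaster():

```java
/**
 * Hypothetical sketch of the preferred option from the discussion: the AM
 * switches its timeline client on a service level returned by the RM,
 * mirroring the proposed "off" / "v1_on" / "v2_on" return values.
 */
public class TimelineNegotiation {
    enum TimelineServiceLevel { OFF, V1_ON, V2_ON }

    /** Pick a client implementation name for the negotiated level. */
    static String clientFor(TimelineServiceLevel level) {
        switch (level) {
            case V1_ON: return "timeline-client-v1";
            case V2_ON: return "timeline-client-v2";
            default:    return "none";
        }
    }

    public static void main(String[] args) {
        // In the proposal, this value would be read from the
        // registerApplicationMaster() response rather than hard-coded.
        TimelineServiceLevel negotiated = TimelineServiceLevel.V2_ON;
        System.out.println("Using: " + clientFor(negotiated));
    }
}
```

The point of the design is visible in the sketch: the AM carries no version-specific configuration of its own, so upgrading the cluster's timeline service does not require touching application configs.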
[jira] [Updated] (YARN-9338) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9338: Attachment: YARN-9338.003.patch
[jira] [Commented] (YARN-9338) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782734#comment-16782734 ] Abhishek Modi commented on YARN-9338: - Thanks [~Prabhu Joseph] for pointing it out. Attached YARN-9338.002.patch to fix this.
[jira] [Updated] (YARN-9338) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9338: Attachment: YARN-9338.002.patch