[jira] [Commented] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083450#comment-16083450 ]

daemon commented on YARN-6791:
------------------------------

[~templedf] Hi, templedf. Could you help me review the code? Thanks a lot!

> Add a new acl control to make YARN acl control perfect
> ------------------------------------------------------
>
>                 Key: YARN-6791
>                 URL: https://issues.apache.org/jira/browse/YARN-6791
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 2.7.2
>            Reporter: daemon
>            Assignee: daemon
>             Fix For: 2.7.2
>
>         Attachments: screenshot-1.png, screenshot-2.png, YARN-6791.001.patch, YARN-6791.002.patch
>
> The YARN application ACL control is imperfect, which can be confusing at times.
> !screenshot-1.png!
> YARN ACLs are disabled by default, but once we enable them, users who are
> neither YARN admins nor queue admins can no longer view application details.
> The YARN RM web UI shows something like this:
> !screenshot-2.png!
> So with those configs enabled, viewing application status through the
> RM web UI becomes very inconvenient.
> There are two ways to solve the problem:
> 1. Improve the web UI so that a user can log in as any user he wants;
> this would also require a proper verification mechanism.
> 2. Add a config in YarnConfiguration that allows certain users to view any
> application they want, but without modify permissions.
> In this way, such a user can view all applications but cannot kill other
> users' applications.
> Of the above two solutions, I choose the second: it is low cost and more useful.
> I will work on this and upload the patch later.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
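The second solution proposed in the description could be sketched roughly as follows. This is a minimal illustration, not the actual patch: the property name `yarn.view.only.acl.users` and the `ViewOnlyAcl` class are hypothetical stand-ins for whatever the real YarnConfiguration key and ACL check look like.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a "view-only" ACL: users in a configured list
 * may view any application, but gain no modify (e.g. kill) permission.
 * The property name below is invented for illustration.
 */
public class ViewOnlyAcl {
    // Would come from yarn-site.xml, e.g. "yarn.view.only.acl.users" (hypothetical key).
    private final Set<String> viewOnlyUsers;

    public ViewOnlyAcl(String commaSeparatedUsers) {
        this.viewOnlyUsers =
            new HashSet<>(Arrays.asList(commaSeparatedUsers.split("\\s*,\\s*")));
    }

    /** View access: owner, admin, or anyone on the view-only list. */
    public boolean canView(String user, String appOwner, boolean isAdmin) {
        return isAdmin || user.equals(appOwner) || viewOnlyUsers.contains(user);
    }

    /** Modify access: never granted through the view-only list. */
    public boolean canModify(String user, String appOwner, boolean isAdmin) {
        return isAdmin || user.equals(appOwner);
    }

    public static void main(String[] args) {
        ViewOnlyAcl acl = new ViewOnlyAcl("alice, bob");
        // alice may view carol's application but may not kill it.
        System.out.println(acl.canView("alice", "carol", false));   // true
        System.out.println(acl.canModify("alice", "carol", false)); // false
    }
}
```

The key design point is that the new list feeds only the view check, so enabling it cannot widen anyone's kill or modify rights.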
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086624#comment-16086624 ]

daemon commented on YARN-6769:
------------------------------

[~yufeigu] Thanks, yufei. My real name is zhouyunfan. Thank you so much for doing so much for me!

> Put the no demand queue after the most in FairSharePolicy#compare
> -----------------------------------------------------------------
>
>                 Key: YARN-6769
>                 URL: https://issues.apache.org/jira/browse/YARN-6769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.2
>            Reporter: daemon
>            Assignee: daemon
>            Priority: Minor
>             Fix For: 2.9.0
>
>         Attachments: YARN-6769.001.patch, YARN-6769.002.patch, YARN-6769.003.patch, YARN-6769.004.patch
>
> When using FairScheduler as the RM scheduler, all queues or applications are
> sorted before containers are assigned, with FairSharePolicy#compare as the
> comparator. The comparator has a problem:
> 1. When a queue's resource usage exceeds its minShare (minResources), it is
> placed behind a queue whose demand is zero, so the zero-demand queue gets a
> greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues or applications.
> I have fixed it and will upload the patch to this JIRA.
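The ordering problem described above can be illustrated with a simplified comparator. This is only a sketch of the idea in the patch, under the assumption that zero-demand schedulables should always sort after those that still want resources; the `Schedulable` stub is invented here, and the real FairSharePolicy#compare also weighs minShare satisfaction, weights, and usage ratios.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/**
 * Simplified stand-in for the FairSharePolicy#compare fix: schedulables
 * (queues/apps) whose demand is zero sort last, so the scheduler does not
 * waste assignment passes offering resources to queues that want none.
 */
public class DemandAwareOrder {
    static final class Schedulable {
        final String name;
        final long demand;  // resources the queue still wants
        final long usage;   // resources currently in use
        Schedulable(String name, long demand, long usage) {
            this.name = name;
            this.demand = demand;
            this.usage = usage;
        }
    }

    /** Zero-demand entries go last; among the rest, less-used ones come first. */
    static final Comparator<Schedulable> CMP = (s1, s2) -> {
        boolean noDemand1 = s1.demand == 0;
        boolean noDemand2 = s2.demand == 0;
        if (noDemand1 != noDemand2) {
            return noDemand1 ? 1 : -1;  // push the no-demand schedulable back
        }
        return Long.compare(s1.usage, s2.usage);  // simplified fairness tie-break
    };

    public static void main(String[] args) {
        List<Schedulable> queues = new ArrayList<>(Arrays.asList(
            new Schedulable("idle", 0, 0),      // demands nothing
            new Schedulable("busy", 100, 50),
            new Schedulable("light", 100, 10)));
        queues.sort(CMP);
        // "idle" sorts last despite having the lowest usage.
        System.out.println(queues.get(2).name); // prints "idle"
    }
}
```

Without the zero-demand check, "idle" would sort first on usage alone and be offered containers it cannot use on every scheduling pass.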
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6791:
-------------------------
    Attachment: YARN-6791.002.patch
[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6769:
-------------------------
    Attachment: YARN-6769.002.patch
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079643#comment-16079643 ]

daemon commented on YARN-6769:
------------------------------

[~hadoopqa] I am sorry for breaking the test; I have fixed it and uploaded a new patch file.
[~yufeigu] yufei, please help me review the code if you have free time. Thanks a lot.
[jira] [Updated] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6802:
-------------------------
    Attachment: YARN-6802.001.patch

> Support view leaf queue am resource usage in RM web ui
> ------------------------------------------------------
>
>                 Key: YARN-6802
>                 URL: https://issues.apache.org/jira/browse/YARN-6802
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 2.7.2
>            Reporter: daemon
>            Assignee: daemon
>             Fix For: 2.8.0
>
>         Attachments: screenshot-1.png, screenshot-2.png, YARN-6802.001.patch
>
> The RM web UI should support viewing leaf queue AM resource usage.
> !screenshot-2.png!
> I will upload my patch later.
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081923#comment-16081923 ]

daemon commented on YARN-6769:
------------------------------

[~yufeigu] Thanks, yufei. You are right; I have already fixed those problems.
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079869#comment-16079869 ]

daemon commented on YARN-6769:
------------------------------

[~templedf] Hi, Daniel. Could you help review my code? Thanks a lot!
[jira] [Assigned] (YARN-6516) FairScheduler:the algorithm of assignContainer is so slow for it only can assign a thousand containers per second
[ https://issues.apache.org/jira/browse/YARN-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon reassigned YARN-6516:
----------------------------
    Assignee: daemon

> FairScheduler: the algorithm of assignContainer is so slow for it only can assign a thousand containers per second
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6516
>                 URL: https://issues.apache.org/jira/browse/YARN-6516
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 2.7.2
>            Reporter: JackZhou
>            Assignee: daemon
[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6769:
-------------------------
    Attachment: YARN-6769.003.patch
[jira] [Updated] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6802:
-------------------------
    Description: RM Web ui should support view leaf queue am resource usage. !screenshot-1.png! I will upload my patch later.
[jira] [Updated] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6802:
-------------------------
    Attachment: screenshot-2.png
[jira] [Updated] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6802:
-------------------------
    Description: RM Web ui should support view leaf queue am resource usage. !screenshot-2.png! I will upload my patch later.
                 (was: RM Web ui should support view leaf queue am resource usage. !screenshot-1.png! I will upload my patch later.)
[jira] [Created] (YARN-6802) Support view leaf queue am resource usage in RM web ui
daemon created YARN-6802:
-------------------------

             Summary: Support view leaf queue am resource usage in RM web ui
                 Key: YARN-6802
                 URL: https://issues.apache.org/jira/browse/YARN-6802
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: yarn
    Affects Versions: 2.7.2
            Reporter: daemon
            Assignee: daemon
             Fix For: 2.8.0
[jira] [Updated] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6802:
-------------------------
    Attachment: screenshot-1.png
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6791:
-------------------------
    Attachment: YARN-6791.001.patch
[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6769:
-------------------------
    Attachment: YARN-6769.004.patch
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085530#comment-16085530 ]

daemon commented on YARN-6769:
------------------------------

[~yufeigu] Hi, yufei. Are there any other problems in my new patch?
[jira] [Commented] (YARN-6802) Support view leaf queue am resource usage in RM web ui
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085601#comment-16085601 ]

daemon commented on YARN-6802:
------------------------------

[~sunilg] Could you review my patch?
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6791:
-------------------------
    Attachment: screenshot-1.png
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6791:
-------------------------
    Attachment: screenshot-2.png
[jira] [Created] (YARN-6791) Add a new acl control to make YARN acl control perfect
daemon created YARN-6791:
-------------------------

             Summary: Add a new acl control to make YARN acl control perfect
                 Key: YARN-6791
                 URL: https://issues.apache.org/jira/browse/YARN-6791
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: yarn
    Affects Versions: 2.7.2
            Reporter: daemon
             Fix For: 2.7.2
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daemon updated YARN-6791:
-------------------------
    Description: (wording revised; the previous version read "he will not view the status of the application" where the current version reads "can not view the application detail")
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079453#comment-16079453 ]

daemon commented on YARN-6769:
------------------------------

[~yufei] Thanks, yufei. I have already uploaded my patch file; what should I do next?
[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6769: - Attachment: YARN-6769.001.patch
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Assignee: daemon
> Priority: Minor
> Fix For: 2.9.0
>
> Attachments: YARN-6769.001.patch
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Comment Edited] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079453#comment-16079453 ] daemon edited comment on YARN-6769 at 7/9/17 5:20 AM: -- [~yufei] Thanks, Yufei. I have already uploaded my patch file; what should I do next? was (Author: daemon): [~yufei] Thanks, Yufei. I have already uploaded my patch file; what should I do next?
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Assignee: daemon
> Priority: Minor
> Fix For: 2.9.0
>
> Attachments: YARN-6769.001.patch
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Updated] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6791: - Description: The YARN application ACL control is imperfect, which can be quite confusing. !screenshot-1.png! The YARN ACL is disabled by default, but once it is enabled, users who are neither YARN admins nor queue admins cannot view the status of an application. The YARN RM web UI then shows something like the following: !screenshot-2.png! So when these configs are enabled, it is very inconvenient for users to view application status. There are two ways to solve the problem: 1. Improve the web UI to allow a user to log in as any user he wants; this also requires a proper verification mechanism. 2. Add a config to YarnConfiguration that allows certain users to view any application they want, but without modify permissions. In this way, those users can view all applications but cannot kill other users' applications. Of the above two solutions, I choose the second: it is low-cost and more useful. I will work on this and upload the patch later.
> Add a new acl control to make YARN acl control perfect
> --
>
> Key: YARN-6791
> URL: https://issues.apache.org/jira/browse/YARN-6791
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Affects Versions: 2.7.2
> Reporter: daemon
> Fix For: 2.7.2
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> The YARN application ACL control is imperfect, which can be quite confusing.
> !screenshot-1.png!
> The YARN ACL is disabled by default, but once it is enabled,
> users who are neither YARN admins nor queue admins cannot view the status
> of an application. The YARN RM web UI then shows something like the following:
> !screenshot-2.png!
> So when these configs are enabled, it is very inconvenient for users to view
> application status.
> There are two ways to solve the problem:
> 1. Improve the web UI to allow a user to log in as any user he wants;
> this also requires a proper verification mechanism.
> 2. Add a config to YarnConfiguration that allows certain users to view any application
> they want, but without modify permissions.
> In this way, those users can view all applications but cannot kill other users' applications.
> Of the above two solutions, I choose the second: it is low-cost and more useful.
> I will work on this and upload the patch later.
[jira] [Assigned] (YARN-6791) Add a new acl control to make YARN acl control perfect
[ https://issues.apache.org/jira/browse/YARN-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon reassigned YARN-6791: Assignee: daemon
> Add a new acl control to make YARN acl control perfect
> --
>
> Key: YARN-6791
> URL: https://issues.apache.org/jira/browse/YARN-6791
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Affects Versions: 2.7.2
> Reporter: daemon
> Assignee: daemon
> Fix For: 2.7.2
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> The YARN application ACL control is imperfect, which can be quite confusing.
> !screenshot-1.png!
> The YARN ACL is disabled by default, but once it is enabled,
> users who are neither YARN admins nor queue admins cannot view the status
> of an application. The YARN RM web UI then shows something like the following:
> !screenshot-2.png!
> So when these configs are enabled, it is very inconvenient for users to view
> application status.
> There are two ways to solve the problem:
> 1. Improve the web UI to allow a user to log in as any user he wants;
> this also requires a proper verification mechanism.
> 2. Add a config to YarnConfiguration that allows certain users to view any application
> they want, but without modify permissions.
> In this way, those users can view all applications but cannot kill other users' applications.
> Of the above two solutions, I choose the second: it is low-cost and more useful.
> I will work on this and upload the patch later.
[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6769: - Description: When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned. FairSharePolicy#compare is used as the comparator, but it is not ideal. It has the following problem: 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero, so the idle queue gets a greater opportunity to receive resources even though it does not want any. This wastes scheduling time when assigning containers to queues and applications. I have fixed this and will upload the patch to this JIRA.
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Priority: Minor
> Fix For: 2.9.0
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Issue Comment Deleted] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6769: - Comment: was deleted (was:
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
index f8cdb45929..e930b80e45 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
@@ -79,6 +79,19 @@ public String getName() {
   @Override
   public int compare(Schedulable s1, Schedulable s2) {
+    Resource demand1 = s1.getDemand();
+    Resource demand2 = s2.getDemand();
+    // Put the schedulable which does not require resources at the end,
+    // so the other schedulables can get resources as soon as possible,
+    // even though they use more resources than their minShare or demand.
+    if (demand1.equals(Resources.none()) &&
+        !demand2.equals(Resources.none())) {
+      return 1;
+    } else if (demand2.equals(Resources.none()) &&
+        !demand1.equals(Resources.none())) {
+      return -1;
+    }
+
     double minShareRatio1, minShareRatio2;
     double useToWeightRatio1, useToWeightRatio2;
     double weight1, weight2;
@@ -86,9 +99,9 @@ public int compare(Schedulable s1, Schedulable s2) {
     Resource resourceUsage1 = s1.getResourceUsage();
     Resource resourceUsage2 = s2.getResourceUsage();
     Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
-        s1.getMinShare(), s1.getDemand());
+        s1.getMinShare(), demand1);
     Resource minShare2 = Resources.min(RESOURCE_CALCULATOR, null,
-        s2.getMinShare(), s2.getDemand());
+        s2.getMinShare(), demand2);
     boolean s1Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
         resourceUsage1, minShare1);
     boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
)
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Priority: Minor
> Fix For: 2.9.0
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Issue Comment Deleted] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6769: - Comment: was deleted (was:
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
index f8cdb45929..e930b80e45 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
@@ -79,6 +79,19 @@ public String getName() {
   @Override
   public int compare(Schedulable s1, Schedulable s2) {
+    Resource demand1 = s1.getDemand();
+    Resource demand2 = s2.getDemand();
+    // Put the schedulable which does not require resources at the end,
+    // so the other schedulables can get resources as soon as possible,
+    // even though they use more resources than their minShare or demand.
+    if (demand1.equals(Resources.none()) &&
+        !demand2.equals(Resources.none())) {
+      return 1;
+    } else if (demand2.equals(Resources.none()) &&
+        !demand1.equals(Resources.none())) {
+      return -1;
+    }
+
     double minShareRatio1, minShareRatio2;
     double useToWeightRatio1, useToWeightRatio2;
     double weight1, weight2;
@@ -86,9 +99,9 @@ public int compare(Schedulable s1, Schedulable s2) {
     Resource resourceUsage1 = s1.getResourceUsage();
     Resource resourceUsage2 = s2.getResourceUsage();
     Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
-        s1.getMinShare(), s1.getDemand());
+        s1.getMinShare(), demand1);
     Resource minShare2 = Resources.min(RESOURCE_CALCULATOR, null,
-        s2.getMinShare(), s2.getDemand());
+        s2.getMinShare(), demand2);
     boolean s1Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
         resourceUsage1, minShare1);
     boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
)
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Priority: Minor
> Fix For: 2.9.0
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Issue Comment Deleted] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6769: - Comment: was deleted (was:
{code:java}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
index f8cdb45929..e930b80e45 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
@@ -79,6 +79,19 @@ public String getName() {
   @Override
   public int compare(Schedulable s1, Schedulable s2) {
+    Resource demand1 = s1.getDemand();
+    Resource demand2 = s2.getDemand();
+    // Put the schedulable which does not require resources at the end,
+    // so the other schedulables can get resources as soon as possible,
+    // even though they use more resources than their minShare or demand.
+    if (demand1.equals(Resources.none()) &&
+        !demand2.equals(Resources.none())) {
+      return 1;
+    } else if (demand2.equals(Resources.none()) &&
+        !demand1.equals(Resources.none())) {
+      return -1;
+    }
+
     double minShareRatio1, minShareRatio2;
     double useToWeightRatio1, useToWeightRatio2;
     double weight1, weight2;
@@ -86,9 +99,9 @@ public int compare(Schedulable s1, Schedulable s2) {
     Resource resourceUsage1 = s1.getResourceUsage();
     Resource resourceUsage2 = s2.getResourceUsage();
     Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
-        s1.getMinShare(), s1.getDemand());
+        s1.getMinShare(), demand1);
     Resource minShare2 = Resources.min(RESOURCE_CALCULATOR, null,
-        s2.getMinShare(), s2.getDemand());
+        s2.getMinShare(), demand2);
     boolean s1Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
         resourceUsage1, minShare1);
     boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
{code}
)
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Priority: Minor
> Fix For: 2.9.0
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
[ https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077525#comment-16077525 ] daemon commented on YARN-6769: --
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
index f8cdb45929..e930b80e45 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
@@ -79,6 +79,19 @@ public String getName() {
   @Override
   public int compare(Schedulable s1, Schedulable s2) {
+    Resource demand1 = s1.getDemand();
+    Resource demand2 = s2.getDemand();
+    // Put the schedulable which does not require resources at the end,
+    // so the other schedulables can get resources as soon as possible,
+    // even though they use more resources than their minShare or demand.
+    if (demand1.equals(Resources.none()) &&
+        !demand2.equals(Resources.none())) {
+      return 1;
+    } else if (demand2.equals(Resources.none()) &&
+        !demand1.equals(Resources.none())) {
+      return -1;
+    }
+
     double minShareRatio1, minShareRatio2;
     double useToWeightRatio1, useToWeightRatio2;
     double weight1, weight2;
@@ -86,9 +99,9 @@ public int compare(Schedulable s1, Schedulable s2) {
     Resource resourceUsage1 = s1.getResourceUsage();
     Resource resourceUsage2 = s2.getResourceUsage();
     Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
-        s1.getMinShare(), s1.getDemand());
+        s1.getMinShare(), demand1);
     Resource minShare2 = Resources.min(RESOURCE_CALCULATOR, null,
-        s2.getMinShare(), s2.getDemand());
+        s2.getMinShare(), demand2);
     boolean s1Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
         resourceUsage1, minShare1);
     boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Priority: Minor
> Fix For: 2.9.0
>
>
> When FairScheduler is used as the RM scheduler, all queues and applications are sorted before containers are assigned.
> FairSharePolicy#compare is used as the comparator, but it is not ideal.
> It has the following problem:
> 1. When a queue uses more resources than its minShare (minResources), it can be placed behind a queue whose demand is zero,
> so the idle queue gets a greater opportunity to receive resources even though it does not want any.
> This wastes scheduling time when assigning containers to queues and applications.
> I have fixed this and will upload the patch to this JIRA.
[jira] [Created] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare
daemon created YARN-6769: Summary: Put the no demand queue after the most in FairSharePolicy#compare Key: YARN-6769 URL: https://issues.apache.org/jira/browse/YARN-6769 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.2 Reporter: daemon Priority: Minor Fix For: 2.9.0
[jira] [Updated] (YARN-6772) Several ways to improve fair scheduler schedule performance
[ https://issues.apache.org/jira/browse/YARN-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6772: - Summary: Several ways to improve fair scheduler schedule performance (was: Several way to improve fair scheduler schedule performance) > Several ways to improve fair scheduler schedule performance > --- > > Key: YARN-6772 > URL: https://issues.apache.org/jira/browse/YARN-6772 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.7.2 > > > There are several ways to
[jira] [Updated] (YARN-6772) Several way to improve fair scheduler schedule performance
[ https://issues.apache.org/jira/browse/YARN-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6772: - Description: There are several ways to > Several way to improve fair scheduler schedule performance > -- > > Key: YARN-6772 > URL: https://issues.apache.org/jira/browse/YARN-6772 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.7.2 > > > There are several ways to
[jira] [Created] (YARN-6772) Several way to improve fair scheduler schedule performance
daemon created YARN-6772: Summary: Several way to improve fair scheduler schedule performance Key: YARN-6772 URL: https://issues.apache.org/jira/browse/YARN-6772 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Affects Versions: 2.7.2 Reporter: daemon Fix For: 2.7.2
[jira] [Updated] (YARN-6772) Several ways to improve fair scheduler schedule performance
[ https://issues.apache.org/jira/browse/YARN-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6772: - Description: There are several ways to improve FairScheduler scheduling performance, and they improve performance a lot in my test environment. We have run them in our production cluster, and the scheduler is quite stable and faster. It can assign over 5000 containers per second, and sometimes over 1 containers. was:There are several ways to
> Several ways to improve fair scheduler schedule performance
> ---
>
> Key: YARN-6772
> URL: https://issues.apache.org/jira/browse/YARN-6772
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: daemon
> Fix For: 2.7.2
>
>
> There are several ways to improve FairScheduler scheduling performance, and they
> improve performance a lot in my test environment.
> We have run them in our production cluster, and the scheduler is quite stable
> and faster.
> It can assign over 5000 containers per second, and sometimes over 1
> containers.
[jira] [Commented] (YARN-6320) FairScheduler:Identifying apps to assign in updateThread
[ https://issues.apache.org/jira/browse/YARN-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032452#comment-16032452 ] daemon commented on YARN-6320: -- I think I have solved the problem, and in my test environment the scheduler can assign over 5000 containers per second.
> FairScheduler:Identifying apps to assign in updateThread
>
>
> Key: YARN-6320
> URL: https://issues.apache.org/jira/browse/YARN-6320
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Tao Jie
>
> In FairScheduler today, we have 1) UpdateThread, which updates queue/app status,
> fair share, and starvation info, and 2) nodeUpdate, triggered by NM heartbeats, which does
> the scheduling. When we handle one nodeUpdate, we walk top-down from the root
> queue to the leaf queues and find the neediest application to allocate a
> container to, according to the queues' fair shares. We also sort the children at
> each hierarchy level.
> My thought is that we have a global sorted {{candidateAppList}} which keeps
> apps that need assignment, and move the logic that "finds the app to allocate
> resources to" from nodeUpdate to UpdateThread. In UpdateThread, we find
> candidate apps to assign and put them into {{candidateAppList}}. In
> nodeUpdate, we consume the list and allocate containers to apps.
> As far as I see, we gain 3 benefits:
> 1. nodeUpdate() is invoked much more frequently than update() in
> UpdateThread, especially in a large cluster. As a result we can avoid much
> unnecessary sorting.
> 2. It coordinates better with YARN-5829: we can indicate apps to
> assign more directly rather than letting nodes find the best apps to assign.
> 3. It seems easier to introduce scheduling constraints such as node labels and
> affinity/anti-affinity into FS, since we can pre-allocate containers
> asynchronously.
> [~kasha], [~templedf], [~yufeigu] would like to hear your thoughts.
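The proposal above — rank candidate apps once in the update thread and let heartbeat handlers consume the ranking — can be sketched as a small publish/consume pair. The class and method names are hypothetical and apps are reduced to plain strings; this shows only the shape of the idea, not actual FairScheduler code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of a global candidate-app list: sorted once per update cycle,
// consumed cheaply on every NM heartbeat.
public class CandidateApps {
  private volatile ConcurrentLinkedQueue<String> candidates =
      new ConcurrentLinkedQueue<>();

  /** Called from the update thread: sort once, then publish the ranking. */
  public void refresh(List<String> apps, Comparator<String> policy) {
    ConcurrentLinkedQueue<String> next = new ConcurrentLinkedQueue<>();
    apps.stream().sorted(policy).forEach(next::add);
    candidates = next; // heartbeat handlers see the new ranking atomically
  }

  /** Called from nodeUpdate on each heartbeat: O(1), no re-sorting. */
  public String nextApp() {
    return candidates.poll(); // null when no candidate is left this cycle
  }
}
```

Since nodeUpdate fires far more often than the update cycle, moving the sort out of the heartbeat path is where the claimed savings come from.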
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042759#comment-16042759 ] daemon commented on YARN-5814: -- [~BINGXUE QIU] Hi Bingxue, can you upload your patch? It would be very useful for me!
> Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose to add Druid as a storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in
> Alibaba clusters with thousands of nodes. We need to collect and store
> meta/events/metrics data, analyze online the utilization reports of various
> dimensions, and display the trends of allocated/used resources for the cluster
> by joining and aggregating data. It helps us to manage and optimize the
> cluster by tracking resource utilization.
> To achieve our goal we have switched to Druid as the storage instead of
> HBase and have achieved sub-second OLAP performance in our production
> environment for a few months.
> h3. Analysis
> Currently YARN Timeline Service only supports aggregating metrics at a) flow
> level by FlowRunCoprocessor and b) application level by
> AppLevelTimelineCollector; offline (time-based periodic) aggregation for
> flows/users/queues for reporting and analysis is planned but not yet
> implemented. YARN Timeline Service uses Apache HBase as the primary
> storage backend. As we all know, HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as online analysis of the utilization
> reports of various dimensions (Queue, Flow, Users, Application, CPU, Memory) by
> joining and aggregating data, Druid's custom column format enables ad-hoc
> queries without pre-computation. The format also enables fast scans on
> columns, which is important for good aggregation performance.
> To achieve our goal of online analysis of the utilization reports of
> various dimensions, displaying the variation trends of allocated/used
> resources for the cluster, and arbitrary exploration of data, we propose to add
> Druid storage and implement DruidWriter/DruidReader in YARN Timeline Service.
[jira] [Created] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
daemon created YARN-6710: Summary: There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue Key: YARN-6710 URL: https://issues.apache.org/jira/browse/YARN-6710 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.2 Reporter: daemon Fix For: 2.8.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Attachment: screenshot-2.png > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, there are -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Attachment: screenshot-3.png > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, there are -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Attachment: screenshot-1.png > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Description: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, 46 applications are pending. Those applications still cannot run after several hours, and in the end I have to stop them. I reproduced the scenario in my test environment and found a bug in FSLeafQueue. In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. When the fair scheduler tries to assign a container to an application attempt, it performs the following check: !screenshot-2.png! !screenshot-3.png! Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. In this scenario, every application attempt stays pending and never gets any resources. I found the reason why so many applications in my leaf queue are pending. I will describe it as follows: When the fair scheduler first assigns a container to the application attempt, it does the following: !screenshot-4.png! When the fair scheduler removes the application attempt from the leaf queue, it does the following: !screenshot-5.png! But when an application attempt unregisters itself and all the containers in SchedulerApplicationAttempt#liveContainers have completed, an APP_ATTEMPT_REMOVED event is sent to the fair scheduler asynchronously. Before the application attempt is removed from the FSLeafQueue, if there are still pending requests in the FSAppAttempt, the fair scheduler will assign a container to it, because the size of liveContainers equals 1.
So the FSLeafQueue adds the container's resources to FSLeafQueue#amResourceUsage again, which makes amResourceUsage greater than its real value. In the end, the value of FSLeafQueue#amResourceUsage is quite large even though there is no application in the queue. When a new application arrives and the value of FSLeafQueue#amResourceUsage is greater than Resources.multiply(getFairShare(), maxAMShare), the scheduler will never assign a container to the queue. All of the applications in the queue stay pending forever. was: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, 46 applications are pending. Those applications still cannot run after several hours, and in the end I have to stop them. I reproduced the scenario in my test environment and found a bug in FSLeafQueue. In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. When the fair scheduler tries to assign a container to an application attempt, it performs the following check: !screenshot-2.png! !screenshot-3.png! Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. In this scenario, every application attempt stays pending and never gets any resources. I found the reason why so many applications in my leaf queue are pending.
I will describe it as follows: > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png, screenshot-5.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, 46 applications are pending. > Those applications still cannot run after several hours, and in the end I have to stop them. > I reproduced the scenario in my test environment and found a bug in FSLeafQueue. > In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. > When the fair scheduler tries to assign a container to an application attempt, it will do the following
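The accounting leak described above can be sketched with a toy model. The field and method names mirror FSLeafQueue#amResourceUsage and FSLeafQueue#canRunAppAM from the fair scheduler, but the numbers and the simplified arithmetic are illustrative assumptions, not the real scheduler code:

```java
// Toy model of the AM-share accounting bug in FSLeafQueue (illustrative only).
public class AmShareLeakSketch {
    static final int FAIR_SHARE_MB = 1000;     // queue fair share, in MB
    static final double MAX_AM_SHARE = 0.5;    // maxAMShare configuration

    static int amResourceUsageMb = 0;          // models FSLeafQueue#amResourceUsage

    // Models FSLeafQueue#canRunAppAM: admit a new AM only if total AM usage
    // stays within fairShare * maxAMShare.
    static boolean canRunAppAM(int amResourceMb) {
        return amResourceUsageMb + amResourceMb <= FAIR_SHARE_MB * MAX_AM_SHARE;
    }

    public static void main(String[] args) {
        // Normal lifecycle: the AM container is charged once, then released
        // when APP_ATTEMPT_REMOVED is processed.
        amResourceUsageMb += 200;
        amResourceUsageMb -= 200;

        // Race described in the report: the attempt has unregistered and its
        // containers have completed, but the asynchronous APP_ATTEMPT_REMOVED
        // event has not been processed yet. A new assignment sees
        // liveContainers.size() == 1 and charges the AM usage a second time.
        amResourceUsageMb += 200;
        // The eventual APP_ATTEMPT_REMOVED only releases the original charge,
        // so this extra 200 MB leaks and stays in amResourceUsage forever.
        System.out.println("leaked AM usage MB: " + amResourceUsageMb);

        // After a few such leaks the queue's AM budget is exhausted and every
        // new application's AM is rejected, even though the queue is empty.
        amResourceUsageMb = 600;  // pretend three leaks of 200 MB accumulated
        System.out.println("canRunAppAM(100): " + canRunAppAM(100));
    }
}
```

With the budget at 1000 * 0.5 = 500 MB and a leaked usage of 600 MB, `canRunAppAM` rejects every new AM, matching the "all applications pending forever" symptom in the report.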
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Description: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although my cluster is mostly idle, there are about > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although my cluster is mostly idle, there are about -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048658#comment-16048658 ] daemon commented on YARN-6710: -- [~dan...@cloudera.com] I am sorry; I am trying to express myself, but my English is poor, so it takes me a while. > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, 46 applications are pending. > Those applications still cannot run after several hours, and in the end I have to stop them. > I reproduced the scenario in my test environment and found a bug in FSLeafQueue. > In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. > When the fair scheduler tries to assign a container to an application attempt, it performs the following check: > !screenshot-2.png! > !screenshot-3.png! > Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. > In this scenario, every application attempt stays pending and never gets any resources. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Attachment: screenshot-4.png > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Description: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, there are was: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although my cluster is mostly idle, there are about > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, there are -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Description: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, 46 applications are pending. Those applications still cannot run after several hours, and in the end I have to stop them. I reproduced the scenario in my test environment and found a bug in FSLeafQueue. In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. When the fair scheduler tries to assign a container to an application attempt, it performs the following check: !screenshot-2.png! !screenshot-3.png! Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. In this scenario, every application attempt stays pending and never gets any resources. was: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, there are > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, 46 applications are pending. > Those applications still cannot run after several hours, and in the end I have to stop them. > I reproduced the scenario in my test environment and found a bug in FSLeafQueue. > In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. > When the fair scheduler tries to assign a container to an application attempt, it performs the following check: > !screenshot-2.png! > !screenshot-3.png! > Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. > In this scenario, every application attempt stays pending and never gets any resources. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Description: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, 46 applications are pending. Those applications still cannot run after several hours, and in the end I have to stop them. I reproduced the scenario in my test environment and found a bug in FSLeafQueue. In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. When the fair scheduler tries to assign a container to an application attempt, it performs the following check: !screenshot-2.png! !screenshot-3.png! Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. In this scenario, every application attempt stays pending and never gets any resources. I found the reason why so many applications in my leaf queue are pending. I will describe it as follows: was: There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. Although there are plenty of free resources in my ResourceManager, 46 applications are pending. Those applications still cannot run after several hours, and in the end I have to stop them. I reproduced the scenario in my test environment and found a bug in FSLeafQueue. In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. When the fair scheduler tries to assign a container to an application attempt, it performs the following check: !screenshot-2.png! !screenshot-3.png! Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. In this scenario, every application attempt stays pending and never gets any resources. > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler. > Although there are plenty of free resources in my ResourceManager, 46 applications are pending. > Those applications still cannot run after several hours, and in the end I have to stop them. > I reproduced the scenario in my test environment and found a bug in FSLeafQueue. > In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. > When the fair scheduler tries to assign a container to an application attempt, it performs the following check: > !screenshot-2.png! > !screenshot-3.png! > Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. > In this scenario, every application attempt stays pending and never gets any resources. > I found the reason why so many applications in my leaf queue are pending. I > will describe it as follows: -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] daemon updated YARN-6710: - Attachment: screenshot-5.png > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Fix For: 2.8.0 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png, screenshot-5.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue
[ https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053715#comment-16053715 ] daemon commented on YARN-6710: -- [~yufeigu] Let me explain in detail. On the YARN side, this problem is caused by the following: after an application attempt finishes, the AM sends an unregisterApplicationMaster RPC request to the RM. While handling this request, the RM does some simple processing, sends an APP_ATTEMPT_REMOVED event to the FairScheduler, and returns. APP_ATTEMPT_REMOVED is handled asynchronously, so the corresponding FSAppAttempt is only removed from the FairScheduler some time later. This leads to two fairly serious consequences: 1. During that interval, the FairScheduler will still assign containers to the FSAppAttempt. While assigning a container, if the condition if (getLiveContainers().size() == 1 && !getUnmanagedAM()) is satisfied, the AM resource is added to amResourceUsage again, making amResourceUsage much larger than its real value. In practice this can leave the jobs in the queue pending forever, never getting any resources, which is the situation I described above. For the over-counting of amResourceUsage, the community already has a patch; see https://issues.apache.org/jira/browse/YARN-3415. 2. The FairScheduler will assign containers to application attempts that have already finished. Although the RM will tell the NM to clean up such a container in the heartbeat response, this still wastes resources, and with scheduling as fast as it is today the problem is even more pronounced. Although the community version has fixed the amResourceUsage problem, I think it only solves part of the problem domain; consequence 2 above also urgently needs to be fixed. I see that YARN-3415 also changed the Spark framework to clear all pending resource requests before unregistering the application attempt, but YARN, as a general-purpose resource scheduling framework, needs to cover all of these cases. We cannot constrain how users use it, and we cannot rely on users releasing all pending requests before every unregisterApplicationMaster call. So we need to make the corresponding check before assigning a container; this urgently needs to be solved. Yufei, based on the above, could you re-evaluate whether this problem needs to be fixed? Thanks. > There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair > scheduler not assign container to the queue > --- > > Key: YARN-6710 > URL: https://issues.apache.org/jira/browse/YARN-6710 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: daemon > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png, screenshot-5.png > > > There are over three thousand nodes in my Hadoop production cluster, and we use the Fair Scheduler.
> Although there are plenty of free resources in my ResourceManager, 46 applications are pending. > Those applications still cannot run after several hours, and in the end I have to stop them. > I reproduced the scenario in my test environment and found a bug in FSLeafQueue. > In an extreme scenario it lets FSLeafQueue#amResourceUsage grow beyond its real value. > When the fair scheduler tries to assign a container to an application attempt, it performs the following check: > !screenshot-2.png! > !screenshot-3.png! > Because the value of FSLeafQueue#amResourceUsage is invalid (greater than its real value), once amResourceUsage exceeds Resources.multiply(getFairShare(), maxAMShare), FSLeafQueue#canRunAppAM returns false and the fair scheduler will not assign a container to the FSAppAttempt. > In this scenario, every application attempt stays pending and never gets any resources. > I found the reason why so many applications in my leaf queue are pending. I will describe it as follows: > When the fair scheduler first assigns a container to the application attempt, it does the following: > !screenshot-4.png! > When the fair scheduler removes the application attempt from the leaf queue, it does the following: > !screenshot-5.png! > But when an application attempt unregisters itself and all the containers in SchedulerApplicationAttempt#liveContainers have completed, an APP_ATTEMPT_REMOVED event is sent to the fair scheduler asynchronously. > Before the application attempt is removed from the FSLeafQueue, if there are still pending requests in the FSAppAttempt, the fair scheduler will assign a container to it, because the size of liveContainers equals 1. > So the FSLeafQueue adds the container's resources to FSLeafQueue#amResourceUsage again, which makes amResourceUsage greater than its real value.
> In the end, the value of FSLeafQueue#amResourceUsage is quite large even though there is no application in the queue. > When a new application arrives and the value of FSLeafQueue#amResourceUsage is greater than Resources.multiply(getFairShare(), maxAMShare), the scheduler will never assign a container to the queue. > All of the applications in the queue stay pending forever. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
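The race between unregisterApplicationMaster and the asynchronous APP_ATTEMPT_REMOVED event, and the guard proposed in the comment above, can be sketched as a toy event-queue model. The class, field, and method names here are illustrative assumptions, not the actual FairScheduler API:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the unregister/APP_ATTEMPT_REMOVED race (illustrative only).
public class AttemptRemovalRaceSketch {
    enum Event { APP_ATTEMPT_REMOVED }

    static final Queue<Event> schedulerEvents = new ArrayDeque<>();
    static boolean attemptStopped = false;   // set when the AM unregisters
    static boolean attemptRemoved = false;   // set when the event is processed
    static int containersAssignedAfterFinish = 0;

    static void unregisterApplicationMaster() {
        attemptStopped = true;
        // The RM enqueues the event and returns; it is handled later,
        // asynchronously, by the scheduler's event dispatcher.
        schedulerEvents.add(Event.APP_ATTEMPT_REMOVED);
    }

    // Without the proposed guard, the scheduler keeps assigning containers
    // to the attempt until APP_ATTEMPT_REMOVED is finally processed.
    static void tryAssign(boolean withGuard) {
        if (attemptRemoved) return;
        if (withGuard && attemptStopped) return;   // the proposed check
        containersAssignedAfterFinish++;           // wasted container
    }

    static void processEvents() {
        while (!schedulerEvents.isEmpty()) {
            schedulerEvents.poll();
            attemptRemoved = true;
        }
    }

    public static void main(String[] args) {
        // Race window: the attempt has unregistered but is not yet removed.
        unregisterApplicationMaster();
        tryAssign(false);
        processEvents();
        System.out.println("wasted containers without guard: " + containersAssignedAfterFinish);

        // Reset and repeat with the guard: nothing is assigned in the window.
        attemptStopped = false; attemptRemoved = false; containersAssignedAfterFinish = 0;
        unregisterApplicationMaster();
        tryAssign(true);
        processEvents();
        System.out.println("wasted containers with guard: " + containersAssignedAfterFinish);
    }
}
```

The point of the sketch is the comment's argument: rather than relying on every framework to drain its pending requests before unregistering, the scheduler itself should check the attempt's state before each assignment.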