[jira] [Commented] (YARN-6481) Yarn top shows negative container number in FS
[ https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991952#comment-15991952 ]

Tao Jie commented on YARN-6481:
-------------------------------

Thank you [~yufeigu]. I added a check in {{TestFairScheduler.testQueueInfo()}} to ensure the metrics here really work.

> Yarn top shows negative container number in FS
> ----------------------------------------------
>
>                 Key: YARN-6481
>                 URL: https://issues.apache.org/jira/browse/YARN-6481
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.9.0
>            Reporter: Yufei Gu
>            Assignee: Tao Jie
>              Labels: newbie
>         Attachments: YARN-6481.001.patch, YARN-6481.002.patch
>
>
> yarn top shows negative container numbers, and they don't change even when they are supposed to.
> {code}
> NodeManager(s): 2 total, 2 active, 0 unhealthy, 0 decommissioned, 0 lost, 0 rebooted
> Queue(s) Applications: 0 running, 12 submitted, 0 pending, 12 completed, 0 killed, 0 failed
> Queue(s) Mem(GB): 0 available, 0 allocated, 0 pending, 0 reserved
> Queue(s) VCores: 0 available, 0 allocated, 0 pending, 0 reserved
> Queue(s) Containers: -2 allocated, -2 pending, -2 reserved
>   APPLICATIONID USER TYPE QUEUE #CONT #RCONT VCORES RVC
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS
[ https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6481:
--------------------------
    Attachment:     (was: YARN-6481.002.patch)
[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS
[ https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6481:
--------------------------
    Attachment: YARN-6481.002.patch
[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS
[ https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6481:
--------------------------
    Attachment: YARN-6481.001.patch
[jira] [Commented] (YARN-6481) Yarn top shows negative container number in FS
[ https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988861#comment-15988861 ]

Tao Jie commented on YARN-6481:
-------------------------------

When generating the QueueStatistics instance in FSQueue, the container metrics are missing. [~yufeigu] [~kasha], I uploaded a patch; would you give it a review?
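The fix described above amounts to copying the container counters from the queue's metrics into the statistics snapshot it hands out. A minimal self-contained sketch of the idea; {{QueueMetrics}} and {{QueueStatistics}} here are simplified stand-ins for the real YARN classes, not their actual APIs:

```java
// Simplified stand-ins for the real YARN types, for illustration only.
class QueueMetrics {
    int allocatedContainers;
    int pendingContainers;
    int reservedContainers;
}

class QueueStatistics {
    int numContainersAllocated;
    int numContainersPending;
    int numContainersReserved;
}

public class QueueStatsSketch {
    // Before the fix, the container fields of the statistics object were
    // never populated, so "yarn top" showed stale or garbage values.
    static QueueStatistics toStatistics(QueueMetrics m) {
        QueueStatistics stats = new QueueStatistics();
        stats.numContainersAllocated = m.allocatedContainers;
        stats.numContainersPending = m.pendingContainers;
        stats.numContainersReserved = m.reservedContainers;
        return stats;
    }

    public static void main(String[] args) {
        QueueMetrics m = new QueueMetrics();
        m.allocatedContainers = 3;
        m.pendingContainers = 1;
        m.reservedContainers = 0;
        QueueStatistics s = toStatistics(m);
        System.out.println(s.numContainersAllocated);  // 3
    }
}
```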
[jira] [Commented] (YARN-6380) FSAppAttempt keeps redundant copy of the queue
[ https://issues.apache.org/jira/browse/YARN-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939735#comment-15939735 ]

Tao Jie commented on YARN-6380:
-------------------------------

[~templedf], in the current FSAppAttempt constructor:
{code}
  public FSAppAttempt(FairScheduler scheduler,
      ApplicationAttemptId applicationAttemptId, String user, FSLeafQueue queue,
      ActiveUsersManager activeUsersManager, RMContext rmContext) {
    super(applicationAttemptId, user, queue, activeUsersManager, rmContext);

    this.scheduler = scheduler;
    this.fsQueue = queue;
    this.startTime = scheduler.getClock().getTime();
    this.lastTimeAtFairShare = this.startTime;
    this.appPriority = Priority.newInstance(1);
    this.resourceWeights = new ResourceWeights();
  }
{code}
It seems to me that we create another reference to the queue rather than a copy of it, so both {{SchedulerApplicationAttempt#queue}} and {{FSAppAttempt#fsQueue}} point to the same object. It is fine to remove the redundant field from FSAppAttempt, but we would then have to cast the Queue to FSQueue when using it.

> FSAppAttempt keeps redundant copy of the queue
> ----------------------------------------------
>
>                 Key: YARN-6380
>                 URL: https://issues.apache.org/jira/browse/YARN-6380
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: YARN-6380.001.patch
>
>
> The {{FSAppAttempt}} class defines its own {{fsQueue}} variable that is a second copy of the {{SchedulerApplicationAttempt}}'s {{queue}} variable. Aside from being redundant, it's also a bug, because when moving applications, we only update the {{SchedulerApplicationAttempt}}'s {{queue}}, not the {{FSAppAttempt}}'s {{fsQueue}}.
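The reference-vs-copy point, and why the second field is still a bug, can be shown in miniature with plain Java (the Queue/Attempt classes below are illustrative stand-ins, not the YARN types): the constructor stores two references to the same object, and the staleness only appears when one field is later reassigned, as happens when an application is moved between queues.

```java
// Minimal illustration of the YARN-6380 discussion: two fields that start
// out referring to the same queue object.
class Queue {
    String name;
    Queue(String name) { this.name = name; }
}

class Attempt {
    Queue queue;    // plays the role of SchedulerApplicationAttempt#queue
    Queue fsQueue;  // plays the role of FSAppAttempt#fsQueue
    Attempt(Queue q) {
        this.queue = q;
        this.fsQueue = q;  // a second reference, not a copy of the queue
    }
}

public class QueueRefDemo {
    public static void main(String[] args) {
        Attempt a = new Attempt(new Queue("root.default"));
        // Both fields point at the very same object...
        System.out.println(a.queue == a.fsQueue);  // true
        // ...so the bug only shows up when the app is moved: reassigning one
        // field leaves the other pointing at the old queue.
        a.queue = new Queue("root.other");
        System.out.println(a.queue == a.fsQueue);  // false: fsQueue is stale
    }
}
```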
[jira] [Commented] (YARN-6320) FairScheduler:Identifying apps to assign in updateThread
[ https://issues.apache.org/jira/browse/YARN-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923469#comment-15923469 ]

Tao Jie commented on YARN-6320:
-------------------------------

[~kasha], thank you for sharing your comment. I will try it that way: in the update thread, compute the resource deficit for every app, taking the policies into account. In the maintained list, sort the apps by the amount of their resource deficit, and on each nodeUpdate generally try to assign containers to the apps with the largest resource deficit.

> FairScheduler: Identifying apps to assign in updateThread
> ---------------------------------------------------------
>
>                 Key: YARN-6320
>                 URL: https://issues.apache.org/jira/browse/YARN-6320
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Tao Jie
>
> In FairScheduler today we have 1) an UpdateThread that updates queue/app status, fair share, and starvation info, and 2) nodeUpdate, triggered by NM heartbeats, that does the scheduling. When we handle one nodeUpdate, we walk top-down from the root queue to the leaf queues and find the neediest application to allocate a container to, according to the queues' fair shares. We also have to sort the children at each level of the hierarchy.
> My thought is to keep a global sorted {{candidateAppList}} of apps that need assignment, and to move the "find the app to allocate resources to" logic from nodeUpdate to the UpdateThread. In the UpdateThread we find candidate apps and put them into {{candidateAppList}}; in nodeUpdate we consume the list and allocate containers to those apps.
> As far as I can see, this has three benefits:
> 1. nodeUpdate() is invoked much more frequently than update() in the UpdateThread, especially in a large cluster, so we can avoid much unnecessary sorting.
> 2. It coordinates better with YARN-5829: we can designate the apps to assign directly rather than let nodes find the best apps to assign.
> 3. It should make it easier to introduce scheduling constraints such as node labels and affinity/anti-affinity into FS, since we can pre-allocate containers asynchronously.
> [~kasha], [~templedf], [~yufeigu], I'd like to hear your thoughts.
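The deficit-sorted candidate list being proposed can be sketched with plain Java collections. The {{App}} record and the deficit numbers below are hypothetical, and a real implementation would also have to handle concurrency between the update thread and the heartbeat handlers:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the proposed flow: the update thread computes a deficit per app
// and sorts once; each nodeUpdate then just takes apps from the head of the
// list instead of re-sorting the whole queue hierarchy.
public class CandidateListSketch {
    record App(String name, long resourceDeficitMB) {}

    // Run in the update thread: order apps by how starved they are.
    static List<App> buildCandidateList(List<App> apps) {
        List<App> sorted = new ArrayList<>(apps);
        sorted.sort(Comparator.comparingLong(App::resourceDeficitMB).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<App> candidates = buildCandidateList(List.of(
                new App("app-1", 512),
                new App("app-2", 4096),
                new App("app-3", 1024)));
        // A nodeUpdate would serve the most starved app first.
        System.out.println(candidates.get(0).name());  // app-2
    }
}
```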
[jira] [Created] (YARN-6320) FairScheduler:Identifying apps to assign in updateThread
Tao Jie created YARN-6320:
--------------------------

             Summary: FairScheduler: Identifying apps to assign in updateThread
                 Key: YARN-6320
                 URL: https://issues.apache.org/jira/browse/YARN-6320
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Tao Jie

In FairScheduler today we have 1) an UpdateThread that updates queue/app status, fair share, and starvation info, and 2) nodeUpdate, triggered by NM heartbeats, that does the scheduling. When we handle one nodeUpdate, we walk top-down from the root queue to the leaf queues and find the neediest application to allocate a container to, according to the queues' fair shares. We also have to sort the children at each level of the hierarchy.

My thought is to keep a global sorted {{candidateAppList}} of apps that need assignment, and to move the "find the app to allocate resources to" logic from nodeUpdate to the UpdateThread. In the UpdateThread we find candidate apps and put them into {{candidateAppList}}; in nodeUpdate we consume the list and allocate containers to those apps.

As far as I can see, this has three benefits:
1. nodeUpdate() is invoked much more frequently than update() in the UpdateThread, especially in a large cluster, so we can avoid much unnecessary sorting.
2. It coordinates better with YARN-5829: we can designate the apps to assign directly rather than let nodes find the best apps to assign.
3. It should make it easier to introduce scheduling constraints such as node labels and affinity/anti-affinity into FS, since we can pre-allocate containers asynchronously.

[~kasha], [~templedf], [~yufeigu], I'd like to hear your thoughts.
[jira] [Commented] (YARN-5829) FS preemption should reserve a node before considering containers on it for preemption
[ https://issues.apache.org/jira/browse/YARN-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904584#comment-15904584 ]

Tao Jie commented on YARN-5829:
-------------------------------

Thank you [~miklos.szeg...@cloudera.com] for sharing your thoughts.
1. It is easy to confuse the reservation we are talking about here with the scheduler's existing reservation mechanism. IIRC, the purpose of the existing reservation is to prevent starvation of requests for large resources, whereas the reservation here assigns a container on a node to one specific application.
2. I am fine with either 1) reusing/extending the current reservation mechanism or 2) adding separate logic to handle reservation for preemption. If we go with 2), we should pick another name to avoid confusion.
3. {quote}
2. We also need to be careful with prioritizing reservations. For example how it works now is that a reservation takes priority before any other request. What happens, if I have a preemption from a lower priority request but there is a demand from a higher priority application?
{quote}
In my opinion, the reservation for preemption should have higher priority than the current reservation in allocation. If the starved application that triggered the preemption is not satisfied as soon as possible, it will remain starved and will try to preempt even more containers. A normal application holding a reserved container on a node, however, can afford to wait a while when the resource goes to another starved application, and it makes sense that it would itself get the higher priority once it becomes starved.

> FS preemption should reserve a node before considering containers on it for preemption
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-5829
>                 URL: https://issues.apache.org/jira/browse/YARN-5829
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Karthik Kambatla
>            Assignee: Miklos Szegedi
>
> FS preemption evaluates nodes for preemption, and subsequently preempts identified containers. If this node is not reserved for a specific application, any other application could be allocated resources on this node. Reserving the node for the starved application before preempting containers would help avoid this.
[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero
[ https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904277#comment-15904277 ]

Tao Jie commented on YARN-6301:
-------------------------------

Thank you [~templedf]. It is almost clear to me now. One thing I'd like to make clear is that the "ad hoc queue" only yields among its sibling queues: if a queue under another parent queue has demand for resources, the "ad hoc queue" can still receive resources through the fair share of its own parent queue. If I am wrong, please correct me.

> Fair scheduler docs should explain the meaning of setting a queue's weight to zero
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-6301
>                 URL: https://issues.apache.org/jira/browse/YARN-6301
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Daniel Templeton
>            Assignee: Tao Jie
>              Labels: docs
>         Attachments: YARN-6301.001.patch, YARN-6301.002.patch
>
[jira] [Updated] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero
[ https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6301:
--------------------------
    Attachment: YARN-6301.002.patch
[jira] [Commented] (YARN-6246) Identifying starved apps does not need the scheduler writelock
[ https://issues.apache.org/jira/browse/YARN-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904237#comment-15904237 ]

Tao Jie commented on YARN-6246:
-------------------------------

Thank you [~kasha] for working on this. It seems to me that the starvation check only operates on leaf queues, so could we just iterate over the leaf queues via {{queueMgr.getLeafQueues()}} rather than walking top-down from the root? I would expect that to be more efficient.

> Identifying starved apps does not need the scheduler writelock
> --------------------------------------------------------------
>
>                 Key: YARN-6246
>                 URL: https://issues.apache.org/jira/browse/YARN-6246
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: YARN-6246.001.patch
>
>
> Currently, the starvation checks are done holding the scheduler writelock. We are probably better off doing this outside.
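The difference between the two traversals the comment contrasts can be sketched independently of YARN (the Queue class below is a simplified stand-in): a top-down walk visits every queue in the hierarchy, while a flat list of leaf queues, maintained up front by the queue manager, visits only the queues that actually hold applications and can therefore starve.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the suggestion on YARN-6246: compare a recursive top-down walk
// with consuming a pre-built flat list of leaf queues.
public class LeafQueueWalkSketch {
    static class Queue {
        final String name;
        final List<Queue> children = new ArrayList<>();
        Queue(String name) { this.name = name; }
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Top-down approach: recurse through the whole hierarchy to reach leaves.
    static void collectLeavesTopDown(Queue q, List<Queue> out) {
        if (q.isLeaf()) { out.add(q); return; }
        for (Queue child : q.children) {
            collectLeavesTopDown(child, out);
        }
    }

    public static void main(String[] args) {
        Queue root = new Queue("root");
        Queue parent = new Queue("root.users");
        parent.children.add(new Queue("root.users.alice"));
        parent.children.add(new Queue("root.users.bob"));
        root.children.add(parent);

        List<Queue> leaves = new ArrayList<>();
        collectLeavesTopDown(root, leaves);
        // A queue manager that keeps this flat list up to date lets the
        // starvation check skip the intermediate queues entirely.
        System.out.println(leaves.size());  // 2
    }
}
```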
[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero
[ https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902659#comment-15902659 ]

Tao Jie commented on YARN-6301:
-------------------------------

Attached a patch that improves the docs in FairScheduler.md.
[jira] [Updated] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero
[ https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6301:
--------------------------
    Attachment: YARN-6301.001.patch
[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero
[ https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902618#comment-15902618 ]

Tao Jie commented on YARN-6301:
-------------------------------

[~templedf], today a queue's weight is allowed to be zero or even negative. It seems to me that in that case the queue cannot get any share beyond its MinResource; am I correct? Should we add a non-negative check here, since a negative queue weight is even more confusing?
[jira] [Commented] (YARN-6307) Refactor FairShareComparator#compare
[ https://issues.apache.org/jira/browse/YARN-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902482#comment-15902482 ]

Tao Jie commented on YARN-6307:
-------------------------------

Thank you [~yufeigu]. FairShareComparator#compare is called very frequently, in every container allocation, so simplifying it would improve scheduler performance. Furthermore, I don't think it is necessary to sort the queue hierarchy from the root down to the leaf queues on every node update. Could we do the sort in the update thread and share the result with node updates? That would eliminate much redundant sorting. Maybe we can address that in another JIRA.

> Refactor FairShareComparator#compare
> ------------------------------------
>
>                 Key: YARN-6307
>                 URL: https://issues.apache.org/jira/browse/YARN-6307
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>
> The method does three things: check the min share ratio, check the weight ratio, and break ties by submit time and name. They are mixed together, which makes the code hard to read and maintain. Additionally, there are potential performance issues; for example, there is no need to calculate the weight ratio every time.
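A decomposition along the lines the description suggests can be sketched with {{Comparator}} chaining. The {{Schedulable}} record below is a simplified stand-in (the real comparator works with Resource objects and scheduling policies), so this is only an illustration of the three-step structure, not the actual YARN refactoring:

```java
import java.util.Comparator;

// Sketch of splitting FairShareComparator#compare into three named steps:
// needy-vs-not (below min share), then usage-to-weight ratio, then a
// deterministic tiebreak by start time and name.
public class FairCompareSketch {
    record Schedulable(String name, long usage, long minShare, double weight,
                       long startTime) {
        boolean isNeedy() { return usage < minShare; }
        double minShareRatio() { return (double) usage / Math.max(minShare, 1); }
        double weightRatio() { return usage / Math.max(weight, 1e-9); }
    }

    static final Comparator<Schedulable> FAIR_COMPARATOR =
        // Needy schedulables come first (false sorts before true)...
        Comparator.comparing((Schedulable s) -> !s.isNeedy())
            // ...most starved relative to min share first among the needy...
            .thenComparingDouble(s -> s.isNeedy() ? s.minShareRatio() : 0.0)
            // ...then lower usage-per-weight wins...
            .thenComparingDouble(Schedulable::weightRatio)
            // ...then break remaining ties deterministically.
            .thenComparingLong(Schedulable::startTime)
            .thenComparing(Schedulable::name);

    public static void main(String[] args) {
        Schedulable starved = new Schedulable("q1", 100, 512, 1.0, 0);
        Schedulable satisfied = new Schedulable("q2", 600, 512, 1.0, 0);
        System.out.println(FAIR_COMPARATOR.compare(starved, satisfied) < 0);  // true
    }
}
```

Each step lives in its own named method, so the "no need to calculate the weight ratio every time" point falls out naturally: the later keys are only evaluated when the earlier ones tie.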
[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900575#comment-15900575 ]

Tao Jie commented on YARN-5881:
-------------------------------

Thank you [~leftnoteasy]. It seems that with this feature the queue-resource configuration would be similar to FairScheduler's. Would it be possible to choose either FS or CS for scheduling with the same configuration file?

> Enable configuration of queue capacity in terms of absolute resources
> ---------------------------------------------------------------------
>
>                 Key: YARN-5881
>                 URL: https://issues.apache.org/jira/browse/YARN-5881
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Sean Po
>            Assignee: Wangda Tan
>         Attachments: YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf
>
>
> Currently, Yarn RM supports the configuration of queue capacity as a proportion of cluster capacity. In the context of Yarn being used as a public cloud service, it makes more sense if queues can be configured in absolute terms. This will allow administrators to set usage limits more concretely and simplify customer expectations for cluster allocation.
[jira] [Commented] (YARN-6042) Dump scheduler and queue state information into FairScheduler DEBUG log
[ https://issues.apache.org/jira/browse/YARN-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893880#comment-15893880 ]

Tao Jie commented on YARN-6042:
-------------------------------

Hi [~yufeigu], dumping the scheduler/queue state is very useful for diagnosing scheduling problems at run time. It seems that you write the scheduler/queue information to a log file; how about also exposing this information in the web UI, just as we can get server stacks via a link?

> Dump scheduler and queue state information into FairScheduler DEBUG log
> -----------------------------------------------------------------------
>
>                 Key: YARN-6042
>                 URL: https://issues.apache.org/jira/browse/YARN-6042
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-6042.001.patch, YARN-6042.002.patch, YARN-6042.003.patch, YARN-6042.004.patch, YARN-6042.005.patch, YARN-6042.006.patch, YARN-6042.007.patch, YARN-6042.008.patch
>
>
> To improve the debugging of scheduler issues, it would be a big improvement to be able to dump the scheduler state into a log on request.
> A dump of the scheduler state at a point in time would allow debugging of a scheduler that is not hung (deadlocked) but is also not assigning containers. Currently we do not have a proper overview of what state the scheduler and the queues are in, and we have to make assumptions or guess.
> The scheduler and queue state needed would include (not exhaustive):
> - instantaneous and steady fair share (app / queue)
> - AM share and resources
> - weight
> - app demand
> - application run state (runnable/non-runnable)
> - last time at fair/min share
[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893722#comment-15893722 ]

Tao Jie commented on YARN-6249:
-------------------------------

Updated the patch per [~yufeigu]'s comments, and ran the tests 200 more times without a failure.

> TestFairSchedulerPreemption is inconsistently failing on trunk
> --------------------------------------------------------------
>
>                 Key: YARN-6249
>                 URL: https://issues.apache.org/jira/browse/YARN-6249
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, resourcemanager
>    Affects Versions: 2.9.0
>            Reporter: Sean Po
>            Assignee: Tao Jie
>         Attachments: YARN-6249.001.patch, YARN-6249.002.patch
>
>
> Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. An example stack trace:
> {noformat}
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption
> testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption)  Time elapsed: 10.475 sec  <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app expected:<4> but was:<8>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363)
> {noformat}
[jira] [Updated] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated YARN-6249:
--------------------------
    Attachment: YARN-6249.002.patch
[jira] [Assigned] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie reassigned YARN-6249:
-----------------------------
    Assignee: Tao Jie  (was: Yufei Gu)
[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893644#comment-15893644 ] Tao Jie commented on YARN-6249: --- Thank you [~yufeigu] [~miklos.szeg...@cloudera.com] for your reply! {quote} Would it make sense to initialize control clock before set it to scheduler like this? {quote} Agree! It makes this test closer to the real world. > TestFairSchedulerPreemption is inconsistently failing on trunk > -- > > Key: YARN-6249 > URL: https://issues.apache.org/jira/browse/YARN-6249 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Sean Po >Assignee: Yufei Gu > Attachments: YARN-6249.001.patch > > > Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. > An example stack trace: > {noformat} > Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption > testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption) > Time elapsed: 10.475 sec <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892327#comment-15892327 ] Tao Jie edited comment on YARN-6249 at 3/2/17 2:32 PM: --- In the attached patch, I called update() explicitly after app1 is allocated and before app2 is submitted, to ensure {{minShareStarvation}} of root.preemptable.child-2 is refreshed. [~yufeigu] [~kasha], would you take a look at it? I ran this case 300 times with no failures; without the patch, about 3 of 100 runs failed. was (Author: tao jie): In the attached patch, I called update() explicitly between app1 is allocated and before app2 is submitted, to ensure {{minShareStarvation}} of root.preemptable.child-2 is refreshed. [~yufeigu] [~kasha], would you take a look at it? I ran 300 times of this case no failure, and 3 of 100 runs would failed without this patch. > TestFairSchedulerPreemption is inconsistently failing on trunk > -- > > Key: YARN-6249 > URL: https://issues.apache.org/jira/browse/YARN-6249 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Sean Po >Assignee: Yufei Gu > Attachments: YARN-6249.001.patch > > > Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. > An example stack trace: > {noformat} > Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption > testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption) > Time elapsed: 10.475 sec <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892327#comment-15892327 ] Tao Jie commented on YARN-6249: --- In the attached patch, I called update() explicitly after app1 is allocated and before app2 is submitted, to ensure {{minShareStarvation}} of root.preemptable.child-2 is refreshed. [~yufeigu] [~kasha], would you take a look at it? I ran this case 300 times with no failures; without the patch, about 3 of 100 runs failed. > TestFairSchedulerPreemption is inconsistently failing on trunk > -- > > Key: YARN-6249 > URL: https://issues.apache.org/jira/browse/YARN-6249 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Sean Po >Assignee: Yufei Gu > Attachments: YARN-6249.001.patch > > > Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. > An example stack trace: > {noformat} > Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption > testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption) > Time elapsed: 10.475 sec <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6249: -- Attachment: YARN-6249.001.patch > TestFairSchedulerPreemption is inconsistently failing on trunk > -- > > Key: YARN-6249 > URL: https://issues.apache.org/jira/browse/YARN-6249 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Sean Po >Assignee: Yufei Gu > Attachments: YARN-6249.001.patch > > > Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. > An example stack trace: > {noformat} > Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption > testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption) > Time elapsed: 10.475 sec <<< FAILURE! > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
[ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892305#comment-15892305 ] Tao Jie commented on YARN-6249: --- I debugged this test and found the root cause of the failure. In the test, FSLeafQueues are initialized before {{scheduler.setClock(clock)}} is called in setup(). As a result, {{lastTimeAtMinShare}} in FSLeafQueue is initialized to the current wall-clock time (a large number), while it is later compared against the time of the {{ControlledClock}}, which starts from 0. In {{FSLeafQueue#minShareStarvation}}, invoked from update():
{code}
long now = scheduler.getClock().getTime();
if (!starved) {
  // Record that the queue is not starved
  setLastTimeAtMinShare(now);
}
if (now - lastTimeAtMinShare < getMinSharePreemptionTimeout()) {
  // the queue is not starved for the preemption timeout
  starvation = Resources.clone(Resources.none());
}
{code}
If {{starved}} is true the first time this method is called, the queue will never satisfy the min share preemption timeout. However, I don't think this is a bug in production, because the issue only arises with the {{ControlledClock}} used in tests. > TestFairSchedulerPreemption is inconsistently failing on trunk > -- > > Key: YARN-6249 > URL: https://issues.apache.org/jira/browse/YARN-6249 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Sean Po >Assignee: Yufei Gu > > Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. > An example stack trace: > {noformat} > Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption > testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption) > Time elapsed: 10.475 sec <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
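The clock mismatch described above can be sketched with a self-contained example. This is an illustrative stand-in, not the actual Hadoop code: the class and method names below are hypothetical. It shows why a "last time at min share" stamp taken from the wall clock can never expire when measured against a controlled clock that starts at 0.

```java
public class ClockMismatchDemo {
    /** A test clock that starts at 0 and only advances when told to. */
    static class ControlledClock {
        private long time = 0;
        long getTime() { return time; }
        void tickMs(long ms) { time += ms; }
    }

    static final long MIN_SHARE_PREEMPTION_TIMEOUT_MS = 5_000;

    // True when the queue has been below min share for at least the timeout.
    public static boolean isStarvedForTimeout(long lastTimeAtMinShare, long now) {
        return now - lastTimeAtMinShare >= MIN_SHARE_PREEMPTION_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        ControlledClock clock = new ControlledClock();

        // Buggy setup: the stamp is taken from the wall clock (a huge number)
        // before the controlled clock is installed.
        long buggyLastTime = System.currentTimeMillis();
        clock.tickMs(60_000); // a full minute of simulated time passes
        // now - lastTime is hugely negative, so the timeout never expires.
        System.out.println(isStarvedForTimeout(buggyLastTime, clock.getTime())); // false

        // Fixed setup: the stamp is taken from the same clock that is later
        // used for the comparison, so both sides share one time base.
        long fixedLastTime = clock.getTime();
        clock.tickMs(60_000);
        System.out.println(isStarvedForTimeout(fixedLastTime, clock.getTime())); // true
    }
}
```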
[jira] [Commented] (YARN-6236) Move lock() out of try-block in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885267#comment-15885267 ] Tao Jie commented on YARN-6236: --- I checked the files in the fair scheduler folder with {{grep -R "Lock.lock()" -A 1 -B 1 ./}} and uploaded the patch. [~kasha], would you give it a review? > Move lock() out of try-block in FairScheduler > - > > Key: YARN-6236 > URL: https://issues.apache.org/jira/browse/YARN-6236 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-6236.001.patch > > > As discussed in YARN-6215, {{read/writeLock.lock()}} inside the try-block is > widely used in existing code, especially in FairScheduler.java, eg: > {code} > public ResourceWeights getAppWeight(FSAppAttempt app) { > try { > readLock.lock(); > ... > ... > return resourceWeights; > } finally { > readLock.unlock(); > } > } > {code} > However in the best practice, {{lock()}} should be called outside of the > try-block. In case of exception happens on {{lock()}} itself, {{unlock()}} in > finally should not be invoked. > We'd better to move {{lock()}} out of try-block. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6236: -- Component/s: fairscheduler > Move lock() out of try-block in FairScheduler > - > > Key: YARN-6236 > URL: https://issues.apache.org/jira/browse/YARN-6236 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-6236.001.patch > > > As discussed in YARN-6215, {{read/writeLock.lock()}} inside the try-block is > widely used in existing code, especially in FairScheduler.java, eg: > {code} > public ResourceWeights getAppWeight(FSAppAttempt app) { > try { > readLock.lock(); > ... > ... > return resourceWeights; > } finally { > readLock.unlock(); > } > } > {code} > However in the best practice, {{lock()}} should be called outside of the > try-block. In case of exception happens on {{lock()}} itself, {{unlock()}} in > finally should not be invoked. > We'd better to move {{lock()}} out of try-block. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6236: -- Priority: Minor (was: Major) > Move lock() out of try-block in FairScheduler > - > > Key: YARN-6236 > URL: https://issues.apache.org/jira/browse/YARN-6236 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-6236.001.patch > > > As discussed in YARN-6215, {{read/writeLock.lock()}} inside the try-block is > widely used in existing code, especially in FairScheduler.java, eg: > {code} > public ResourceWeights getAppWeight(FSAppAttempt app) { > try { > readLock.lock(); > ... > ... > return resourceWeights; > } finally { > readLock.unlock(); > } > } > {code} > However in the best practice, {{lock()}} should be called outside of the > try-block. In case of exception happens on {{lock()}} itself, {{unlock()}} in > finally should not be invoked. > We'd better to move {{lock()}} out of try-block. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6236: -- Attachment: YARN-6236.001.patch > Move lock() out of try-block in FairScheduler > - > > Key: YARN-6236 > URL: https://issues.apache.org/jira/browse/YARN-6236 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-6236.001.patch > > > As discussed in YARN-6215, {{read/writeLock.lock()}} inside the try-block is > widely used in existing code, especially in FairScheduler.java, eg: > {code} > public ResourceWeights getAppWeight(FSAppAttempt app) { > try { > readLock.lock(); > ... > ... > return resourceWeights; > } finally { > readLock.unlock(); > } > } > {code} > However in the best practice, {{lock()}} should be called outside of the > try-block. In case of exception happens on {{lock()}} itself, {{unlock()}} in > finally should not be invoked. > We'd better to move {{lock()}} out of try-block. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6236) Move lock() out of try-block in FairScheduler
Tao Jie created YARN-6236: - Summary: Move lock() out of try-block in FairScheduler Key: YARN-6236 URL: https://issues.apache.org/jira/browse/YARN-6236 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Jie Assignee: Tao Jie As discussed in YARN-6215, {{read/writeLock.lock()}} inside the try-block is widely used in the existing code, especially in FairScheduler.java, e.g.:
{code}
public ResourceWeights getAppWeight(FSAppAttempt app) {
  try {
    readLock.lock();
    ...
    ...
    return resourceWeights;
  } finally {
    readLock.unlock();
  }
}
{code}
However, best practice is to call {{lock()}} outside the try-block: if an exception happens in {{lock()}} itself, the {{unlock()}} in the finally-block should not be invoked. We should move {{lock()}} out of the try-block. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
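The recommended pattern can be sketched as a minimal self-contained example; {{getAppWeight}} and {{setAppWeight}} here are simplified stand-ins for the real FairScheduler methods, assuming a plain {{ReentrantReadWriteLock}}. Acquiring the lock before entering the try-block guarantees that the {{unlock()}} in the finally-block only runs if the lock was actually acquired.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockPlacementDemo {
    private static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private static double resourceWeight = 1.0;

    // Recommended: lock() sits before the try-block. If lock() itself threw,
    // the finally-block would never run, so we would never call unlock() on
    // a lock we failed to acquire.
    public static double getAppWeight() {
        rwLock.readLock().lock();
        try {
            return resourceWeight;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public static void setAppWeight(double w) {
        rwLock.writeLock().lock();
        try {
            resourceWeight = w;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        setAppWeight(2.5);
        System.out.println(getAppWeight()); // 2.5
    }
}
```

Note that the anti-pattern (lock() as the first statement inside try) usually behaves identically in practice, since {{ReentrantReadWriteLock.lock()}} does not throw checked exceptions; the placement matters only for the rare unchecked-exception case discussed in the comments.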
[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885080#comment-15885080 ] Tao Jie commented on YARN-6215: --- [~kasha] thank you for your comments; patch updated. {quote} lock() should be called outside the try-block {quote} I didn't think much about this before. Both {{lock()}} inside and {{lock()}} outside the try-block exist in the current code. I checked some discussions on Stack Overflow: {{lock()}} itself does not throw checked exceptions, but if it throws an unchecked exception (which should rarely happen), {{unlock()}} should not be invoked. So {{lock()}} outside the try-block is the better practice. Is it necessary to move existing {{lock()}} calls outside the try-block? At least in {{FairScheduler.java}}, most {{lock()}} calls are inside the try-block now. > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch, YARN-6215.002.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at >
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6215: -- Attachment: (was: YARN-6215.002.patch) > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch, YARN-6215.002.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6215: -- Attachment: YARN-6215.002.patch > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch, YARN-6215.002.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6215: -- Attachment: YARN-6215.002.patch > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch, YARN-6215.002.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881721#comment-15881721 ] Tao Jie commented on YARN-6215: --- [~kasha] thank you for your comment! I understand your concern. Adding a read lock may cost some performance, but it is risky not to. Doing preemption while the fair shares of the queues are only partially updated would not only miss containers that should be preempted, but could also mistakenly preempt containers that should not be; the result would be unpredictable. In earlier FS code, updating fair shares and preempting containers took place in one thread, so I think a read lock here would not make performance any worse. > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at >
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880886#comment-15880886 ] Tao Jie commented on YARN-6215: --- [~kasha] [~yufeigu] [~sunilg], I uploaded a patch that adds a read lock in FSPreemptionThread. Would you mind giving it a review? > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6215: -- Attachment: YARN-6215.001.patch > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > Attachments: YARN-6215.001.patch > > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880872#comment-15880872 ] Tao Jie commented on YARN-6215: --- I debugged this unit test and found that the failure is caused by a race between the update thread and the preemption thread. The update thread goes through all queues and triggers the preemption thread as soon as it finds a starved app. At that moment some queues have been updated while others have not, so the preemption thread may look for containers to preempt against this incomplete state. Since the update thread already runs under the FairScheduler write lock, the fix is to take the FairScheduler read lock in the preemption thread. > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This
message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
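The locking scheme described in the comment above can be sketched with a {{ReentrantReadWriteLock}}. This is an illustrative sketch only: the class and method names are hypothetical, and the actual patch uses FairScheduler's existing read/write lock rather than introducing a new one.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: the update thread mutates queue state under the
// write lock, so the preemption thread must take the read lock to observe
// a consistent snapshot of all queues rather than a half-updated one.
public class SchedulerLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int queuesUpdated = 0;           // stand-in for per-queue state
    private static final int NUM_QUEUES = 4;

    // Update thread: walks all queues under the write lock.
    public void updateAllQueues() {
        lock.writeLock().lock();
        try {
            for (int i = 0; i < NUM_QUEUES; i++) {
                queuesUpdated++;             // update queue i
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Preemption thread: with the read lock held, it can never observe a
    // partially updated set of queues.
    public boolean sawConsistentState() {
        lock.readLock().lock();
        try {
            // Either all queues were updated or none were; never in between.
            return queuesUpdated % NUM_QUEUES == 0;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SchedulerLockSketch s = new SchedulerLockSketch();
        s.updateAllQueues();
        System.out.println(s.sawConsistentState() ? "consistent" : "torn");
    }
}
```

Without the read lock, the preemption thread could run between two of the per-queue updates and compute preemption targets against the "torn" intermediate state, which is exactly the intermittent test failure above.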
[jira] [Assigned] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie reassigned YARN-6215: - Assignee: Tao Jie > TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in > trunk > > > Key: YARN-6215 > URL: https://issues.apache.org/jira/browse/YARN-6215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, test >Reporter: Sunil G >Assignee: Tao Jie > > *Error Message* > Incorrect number of containers on the greedy app expected:<4> but was:<8> > Failed test case > [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/] > *Stacktrace* > {noformat} > java.lang.AssertionError: Incorrect number of containers on the greedy app > expected:<4> but was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6225) Global scheduler applies to Fair scheduler
[ https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6225: -- Summary: Global scheduler applies to Fair scheduler (was: Global scheduler apply to Fair scheduler) > Global scheduler applies to Fair scheduler > -- > > Key: YARN-6225 > URL: https://issues.apache.org/jira/browse/YARN-6225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Jie > > IIRC in global scheduling, logic for scheduling constraint such as nodelabel, > affinity/anti-affinity would take place before the scheduler try to commit > ResourceCommitRequest. This logic looks can be shared by FairScheduler and > CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6225) Global scheduler apply to Fair scheduler
[ https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie reassigned YARN-6225: - Assignee: Tao Jie > Global scheduler apply to Fair scheduler > > > Key: YARN-6225 > URL: https://issues.apache.org/jira/browse/YARN-6225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Jie >Assignee: Tao Jie > > IIRC in global scheduling, logic for scheduling constraint such as nodelabel, > affinity/anti-affinity would take place before the scheduler try to commit > ResourceCommitRequest. This logic looks can be shared by FairScheduler and > CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6225) Global scheduler apply to Fair scheduler
[ https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie reassigned YARN-6225: - Assignee: (was: Tao Jie) > Global scheduler apply to Fair scheduler > > > Key: YARN-6225 > URL: https://issues.apache.org/jira/browse/YARN-6225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Jie > > IIRC in global scheduling, logic for scheduling constraint such as nodelabel, > affinity/anti-affinity would take place before the scheduler try to commit > ResourceCommitRequest. This logic looks can be shared by FairScheduler and > CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6225) Global scheduler apply to Fair scheduler
Tao Jie created YARN-6225: - Summary: Global scheduler apply to Fair scheduler Key: YARN-6225 URL: https://issues.apache.org/jira/browse/YARN-6225 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Jie IIRC, in global scheduling, the logic for scheduling constraints such as node labels and affinity/anti-affinity takes place before the scheduler tries to commit a ResourceCommitRequest. It looks like this logic can be shared by FairScheduler and CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6224) Should consider utilization of each ResourceType on node while scheduling
Tao Jie created YARN-6224: - Summary: Should consider utilization of each ResourceType on node while scheduling Key: YARN-6224 URL: https://issues.apache.org/jira/browse/YARN-6224 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Jie In situations like YARN-6101, if we consider the utilization of each resource type (vcores, memory) on a node, rather than just answering whether we can allocate or not, we are more likely to achieve better resource utilization as a whole. With global scheduling it becomes possible to take a set of candidate nodes and then find the most promising node to assign to a request, considering node resource utilization. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
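The idea in this issue — scoring candidate nodes by per-resource-type utilization instead of a yes/no fit check — could look something like the following. This is a hedged sketch, not any actual YARN scorer: the class name, the "minimize the maximum post-allocation utilization ratio" rule, and the array-based node representation are all illustrative assumptions.

```java
// Hypothetical sketch of picking the "most promising" candidate node for a
// request by per-resource-type (vcores, memory) utilization.
public class NodeScorerSketch {
    // used[i] = {usedVcores, usedMemMb}, cap[i] = {capVcores, capMemMb}.
    // Returns the index of the candidate whose post-allocation utilization
    // is most balanced (lowest maximum ratio), or -1 if none fits.
    public static int bestNode(int[][] used, int[][] cap,
                               int reqVcores, int reqMemMb) {
        int best = -1;
        double bestScore = Double.MAX_VALUE;
        for (int i = 0; i < used.length; i++) {
            int v = used[i][0] + reqVcores;
            int m = used[i][1] + reqMemMb;
            if (v > cap[i][0] || m > cap[i][1]) {
                continue;  // node cannot satisfy the request at all
            }
            double score = Math.max((double) v / cap[i][0],
                                    (double) m / cap[i][1]);
            if (score < bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Node 0 is memory-saturated (cf. YARN-6101's 20/48-core,
        // 191/192 GB example); node 1 has headroom on both resource types.
        int[][] used = { {20, 191000}, {10, 60000} };
        int[][] cap  = { {48, 192000}, {48, 192000} };
        System.out.println(bestNode(used, cap, 1, 2048));  // prints 1
    }
}
```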
[jira] [Commented] (YARN-6101) Delay scheduling for node resource balance
[ https://issues.apache.org/jira/browse/YARN-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877833#comment-15877833 ] Tao Jie commented on YARN-6101: --- [~He Tianyi], thank you for sharing your case. Today scheduling is triggered by NM heartbeat; that is, when an NM heartbeat arrives, the scheduler selects containers to assign to that NM. It is difficult to find the globally best node on which to run a container for an application. It seems that YARN-5139 improves the scheduling logic: first we find a set of candidate nodes for each resource request, then a NodeScorer measures which node is the best to allocate on. In that case, the node's utilization should be considered. > Delay scheduling for node resource balance > -- > > Key: YARN-6101 > URL: https://issues.apache.org/jira/browse/YARN-6101 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: He Tianyi >Priority: Minor > Attachments: YARN-6101.preliminary..patch > > > We observed that, in today's cluster, usage of Spark has dramatically > increased. > This introduced a new issue that CPU/MEM utilization for single node may > become imbalanced due to Spark is generally more memory intensive. For > example, after a node with capability (48 cores, 192 GB memory) cannot > satisfy a (1 core, 2 GB memory) request if current used resource is (20 > cores, 191 GB memory), with plenty of total available resource across the > whole cluster. > A thought for avoiding the situation is to introduce some strategy during > scheduling. > This JIRA proposes a delay-scheduling-alike approach to achieve better > balance between different type of resources on each node. 
> The basic idea is consider dominant resource for each node, and when a > scheduling opportunity on a particular node is offered to a resource request, > better make sure the allocation is changing dominant resource of the node, > or, in worst case, allocate at once when number of offered scheduling > opportunities exceeds a certain number. > With YARN SLS and a simulation file with hybrid workload (MR+Spark), the > approach improved cluster resource usage by nearly 5%. And after deployed to > production, we observed a 8% improvement. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
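The heuristic quoted above — allocate only when the request would rebalance the node's dominant resource, falling back to an unconditional allocation after a bounded number of skipped scheduling opportunities — can be sketched as follows. This is an illustrative sketch under assumptions, not the attached preliminary patch: the skip threshold, method names, and the "request's dominant resource differs from the node's" test are hypothetical choices.

```java
// Illustrative sketch of the delay-scheduling-alike idea from YARN-6101.
public class DominantResourceDelaySketch {
    static final int MAX_SKIPS = 3;  // assumed opportunity threshold

    // Dominant resource index: 0 = vcores, 1 = memory.
    static int dominant(double vcoreUtil, double memUtil) {
        return vcoreUtil >= memUtil ? 0 : 1;
    }

    // Accept the scheduling opportunity if the request's dominant resource
    // differs from the node's (the allocation pushes the node toward
    // balance), or, in the worst case, if we have already skipped this
    // request MAX_SKIPS times.
    static boolean shouldAllocate(double nodeVcoreUtil, double nodeMemUtil,
                                  double reqVcoreShare, double reqMemShare,
                                  int skips) {
        if (skips >= MAX_SKIPS) {
            return true;  // stop delaying to avoid starving the request
        }
        return dominant(reqVcoreShare, reqMemShare)
            != dominant(nodeVcoreUtil, nodeMemUtil);
    }

    public static void main(String[] args) {
        // Memory-heavy node + memory-heavy (Spark-like) request: delay.
        System.out.println(shouldAllocate(0.4, 0.9, 0.1, 0.5, 0));  // false
        // Memory-heavy node + CPU-heavy request: allocate immediately.
        System.out.println(shouldAllocate(0.4, 0.9, 0.5, 0.1, 0));  // true
        // Delayed too long: allocate regardless.
        System.out.println(shouldAllocate(0.4, 0.9, 0.1, 0.5, 3));  // true
    }
}
```

The bounded skip count is what makes this "delay-scheduling-alike": like locality delay scheduling, it trades a few deferred opportunities for a better placement, but never defers indefinitely.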
[jira] [Commented] (YARN-5829) FS preemption should reserve a node before considering containers on it for preemption
[ https://issues.apache.org/jira/browse/YARN-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871054#comment-15871054 ] Tao Jie commented on YARN-5829: --- [~kasha], this seems to be a similar situation to the one mentioned in YARN-5636. Should we have a common mechanism that supports "reserving certain resources on a certain node for a certain app for a while"? > FS preemption should reserve a node before considering containers on it for > preemption > -- > > Key: YARN-5829 > URL: https://issues.apache.org/jira/browse/YARN-5829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > > FS preemption evaluates nodes for preemption, and subsequently preempts > identified containers. If this node is not reserved for a specific > application, any other application could be allocated resources on this node. > Reserving the node for the starved application before preempting containers > would help avoid this. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869650#comment-15869650 ] Tao Jie commented on YARN-2497: --- We already have an implementation of node-label support for FS, but it is based on an earlier Hadoop version. I would like to rebase the patch, since the preemption logic for FairScheduler has been refactored in YARN-4752. [~kasha], would you mind if I take this JIRA over? > Changes for fair scheduler to support allocate resource respect labels > -- > > Key: YARN-2497 > URL: https://issues.apache.org/jira/browse/YARN-2497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Wangda Tan > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
[ https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749979#comment-15749979 ] Tao Jie commented on YARN-6000: --- [~templedf], [~kasha] would you mind taking a look at it? > Set modifier of interface Listener in AllocationFileLoaderService to public > --- > > Key: YARN-6000 > URL: https://issues.apache.org/jira/browse/YARN-6000 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-6000.001.patch > > > We removed public modifier of {{AllocationFileLoaderService.Listener}} in > YARN-4997 since it trigger a findbugs warning. However it breaks Hive code in > {{FairSchedulerShim}}. > {code} > AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); > allocsLoader.init(conf); > allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() > { > @Override > public void onReload(AllocationConfiguration allocs) { > allocConf.set(allocs); > } > }); > try { > allocsLoader.reloadAllocations(); > } catch (Exception ex) { > throw new IOException("Failed to load queue allocations", ex); > } > if (allocConf.get() == null) { > allocConf.set(new AllocationConfiguration(conf)); > } > QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); > if (queuePolicy != null) { > requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); > {code} > As a result we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
[ https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6000: -- Attachment: YARN-6000.001.patch > Set modifier of interface Listener in AllocationFileLoaderService to public > --- > > Key: YARN-6000 > URL: https://issues.apache.org/jira/browse/YARN-6000 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-6000.001.patch > > > We removed public modifier of {{AllocationFileLoaderService.Listener}} in > YARN-4997 since it trigger a findbugs warning. However it breaks Hive code in > {{FairSchedulerShim}}. > {code} > AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); > allocsLoader.init(conf); > allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() > { > @Override > public void onReload(AllocationConfiguration allocs) { > allocConf.set(allocs); > } > }); > try { > allocsLoader.reloadAllocations(); > } catch (Exception ex) { > throw new IOException("Failed to load queue allocations", ex); > } > if (allocConf.get() == null) { > allocConf.set(new AllocationConfiguration(conf)); > } > QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); > if (queuePolicy != null) { > requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); > {code} > As a result we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746988#comment-15746988 ] Tao Jie commented on YARN-4997: --- Thank you [~sershe], I have created another JIRA, YARN-6000, to handle this. It's OK if you change the Hive code to work around it and make the logic clearer. However, since this change breaks existing code, we should get it fixed; otherwise, updating the Hadoop version without updating Hive (which may not have been released yet) would fail. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Fix For: 3.0.0-alpha2 > > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
[ https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie reassigned YARN-6000: - Assignee: Tao Jie > Set modifier of interface Listener in AllocationFileLoaderService to public > --- > > Key: YARN-6000 > URL: https://issues.apache.org/jira/browse/YARN-6000 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Tao Jie >Assignee: Tao Jie > > We removed public modifier of {{AllocationFileLoaderService.Listener}} in > YARN-4997 since it trigger a findbugs warning. However it breaks Hive code in > {{FairSchedulerShim}}. > {code} > AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); > allocsLoader.init(conf); > allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() > { > @Override > public void onReload(AllocationConfiguration allocs) { > allocConf.set(allocs); > } > }); > try { > allocsLoader.reloadAllocations(); > } catch (Exception ex) { > throw new IOException("Failed to load queue allocations", ex); > } > if (allocConf.get() == null) { > allocConf.set(new AllocationConfiguration(conf)); > } > QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); > if (queuePolicy != null) { > requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); > {code} > As a result we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
[ https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6000: -- Component/s: yarn fairscheduler > Set modifier of interface Listener in AllocationFileLoaderService to public > --- > > Key: YARN-6000 > URL: https://issues.apache.org/jira/browse/YARN-6000 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Tao Jie > > We removed public modifier of {{AllocationFileLoaderService.Listener}} in > YARN-4997 since it trigger a findbugs warning. However it breaks Hive code in > {{FairSchedulerShim}}. > {code} > AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); > allocsLoader.init(conf); > allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() > { > @Override > public void onReload(AllocationConfiguration allocs) { > allocConf.set(allocs); > } > }); > try { > allocsLoader.reloadAllocations(); > } catch (Exception ex) { > throw new IOException("Failed to load queue allocations", ex); > } > if (allocConf.get() == null) { > allocConf.set(new AllocationConfiguration(conf)); > } > QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); > if (queuePolicy != null) { > requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); > {code} > As a result we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
Tao Jie created YARN-6000: - Summary: Set modifier of interface Listener in AllocationFileLoaderService to public Key: YARN-6000 URL: https://issues.apache.org/jira/browse/YARN-6000 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Jie We removed the public modifier of {{AllocationFileLoaderService.Listener}} in YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in {{FairSchedulerShim}}. {code} AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); allocsLoader.init(conf); allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() { @Override public void onReload(AllocationConfiguration allocs) { allocConf.set(allocs); } }); try { allocsLoader.reloadAllocations(); } catch (Exception ex) { throw new IOException("Failed to load queue allocations", ex); } if (allocConf.get() == null) { allocConf.set(new AllocationConfiguration(conf)); } QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); if (queuePolicy != null) { requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); {code} As a result, we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
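The proposed fix is simply restoring the access modifier so that code outside the package (such as Hive's {{FairSchedulerShim}} shown above) can implement the callback. A simplified sketch of why {{public}} matters here, with the surrounding class heavily abbreviated and the listener payload replaced by a stand-in type:

```java
// Simplified sketch of the YARN-6000 change: the nested Listener interface
// is declared public so external callers can implement it. This is not the
// real AllocationFileLoaderService; the real onReload takes an
// AllocationConfiguration, abbreviated here as Object.
public class AllocationFileLoaderServiceSketch {
    public interface Listener {            // "public" restored per YARN-6000
        void onReload(Object allocs);
    }

    private Listener reloadListener;

    public void setReloadListener(Listener listener) {
        this.reloadListener = listener;
    }

    // Stand-in for the real reload path that re-reads the allocation file
    // and notifies the registered listener.
    public void reloadAllocations() {
        if (reloadListener != null) {
            reloadListener.onReload(new Object());
        }
    }

    public static void main(String[] args) {
        AllocationFileLoaderServiceSketch svc =
            new AllocationFileLoaderServiceSketch();
        final boolean[] called = {false};
        // From outside the package this implementation only compiles if
        // Listener is public (package-private nested types are invisible).
        svc.setReloadListener(new Listener() {
            @Override
            public void onReload(Object allocs) {
                called[0] = true;
            }
        });
        svc.reloadAllocations();
        System.out.println(called[0] ? "reloaded" : "not reloaded");
    }
}
```

Without {{public}}, the nested interface is package-private, so Hive's anonymous implementation in another package fails to compile even though the enclosing class and {{setReloadListener}} remain public.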
[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public
[ https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-6000: -- Affects Version/s: 3.0.0-alpha1 > Set modifier of interface Listener in AllocationFileLoaderService to public > --- > > Key: YARN-6000 > URL: https://issues.apache.org/jira/browse/YARN-6000 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha1 >Reporter: Tao Jie > > We removed public modifier of {{AllocationFileLoaderService.Listener}} in > YARN-4997 since it trigger a findbugs warning. However it breaks Hive code in > {{FairSchedulerShim}}. > {code} > AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService(); > allocsLoader.init(conf); > allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() > { > @Override > public void onReload(AllocationConfiguration allocs) { > allocConf.set(allocs); > } > }); > try { > allocsLoader.reloadAllocations(); > } catch (Exception ex) { > throw new IOException("Failed to load queue allocations", ex); > } > if (allocConf.get() == null) { > allocConf.set(new AllocationConfiguration(conf)); > } > QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy(); > if (queuePolicy != null) { > requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName); > {code} > As a result we should set the modifier back to public. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746936#comment-15746936 ] Tao Jie commented on YARN-4997: --- [~sershe], we discussed the modifier of {{interface Listener}} earlier in this patch. We removed public from the Listener interface because it produced a findbugs warning, and {{public}} seemed unnecessary there. Since this breaks Hive code, I prefer to add {{public}} back to Listener. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Fix For: 3.0.0-alpha2 > > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707795#comment-15707795 ] Tao Jie commented on YARN-4997: --- [~templedf], thanks for your patient review, and sorry for misunderstanding your comments. Updated the patch. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-011.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704462#comment-15704462 ] Tao Jie commented on YARN-4997: --- Updated the patch per [~templedf]'s comments. The test failure is unrelated. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-010.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5040) CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when yarn.nodemanager.resource.percentage-physical-cpu-limit < 100
[ https://issues.apache.org/jira/browse/YARN-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688554#comment-15688554 ] Tao Jie commented on YARN-5040: --- We have met the same problem. With yarn.nodemanager.resource.percentage-physical-cpu-limit=80, we tested kernel versions 2.6.32-642 and 3.10.103 with hadoop-2.7.1 by running Terasort, and the kernel crashed in both cases. After we upgraded the kernel to 4.8.1, the kernel panics no longer occurred. It seems this panic is due to a kernel cgroup bug that is fixed in newer kernel versions. > CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when > yarn.nodemanager.resource.percentage-physical-cpu-limit < 100 > -- > > Key: YARN-5040 > URL: https://issues.apache.org/jira/browse/YARN-5040 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Sidharta Seethana >Assignee: Varun Vasudev > > /cc [~vvasudev] > We have been running some benchmarks internally with resource isolation > enabled. We have consistently run into kernel panics when running a large job > ( a large pi job, terasort ). These kernel panics wen't away when we set > yarn.nodemanager.resource.percentage-physical-cpu-limit=100 . Anything less > than 100 triggers different behavior in YARN's CPU resource handler which > seems to cause these issues. Looking at the kernel crash dumps, the > backtraces were different - sometimes pointing to java processes, sometimes > not. > Kernel versions used : 3.10.0-229.14.1.el7.x86_64 and > 3.10.0-327.13.1.el7.x86_64 . -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634894#comment-15634894 ] Tao Jie commented on YARN-2497: --- Are there any updates on this JIRA? It would be very useful to me. > Changes for fair scheduler to support allocate resource respect labels > -- > > Key: YARN-2497 > URL: https://issues.apache.org/jira/browse/YARN-2497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Wangda Tan >Assignee: Naganarasimha G R >
[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15632064#comment-15632064 ] Tao Jie commented on YARN-5720: --- It seems that the nodeLabel-related commands are not included in YarnCommands.html in branch-2.8. > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720-branch-2.8.patch, YARN-5720.001.patch, > YARN-5720.002.patch, YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, the document should be updated since the commands have changed.
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YARN-5720-branch-2.8.patch > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720-branch-2.8.patch, YARN-5720.001.patch, > YARN-5720.002.patch, YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.009.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch, > YARN-5552.008.patch, YARN-5552.009.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
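The description above proposes Builder classes for records such as ResourceRequest. A minimal sketch of that pattern follows; the class and field names are illustrative only, not the actual YARN API:

```java
// Hypothetical sketch of the Builder pattern proposed for YARN API records.
// Field names are simplified; the real ResourceRequest has more fields.
public class ResourceRequestSketch {
  private final int memory;
  private final int vCores;
  private final int numContainers;

  private ResourceRequestSketch(Builder b) {
    this.memory = b.memory;
    this.vCores = b.vCores;
    this.numContainers = b.numContainers;
  }

  public static Builder newBuilder() { return new Builder(); }

  public static final class Builder {
    // Defaults let callers set only the fields they care about,
    // avoiding the constructor/newInstance overload explosion.
    private int memory = 1024;
    private int vCores = 1;
    private int numContainers = 1;

    public Builder memory(int m) { this.memory = m; return this; }
    public Builder vCores(int v) { this.vCores = v; return this; }
    public Builder numContainers(int n) { this.numContainers = n; return this; }
    public ResourceRequestSketch build() { return new ResourceRequestSketch(this); }
  }

  public int getMemory() { return memory; }
  public int getVCores() { return vCores; }
  public int getNumContainers() { return numContainers; }

  public static void main(String[] args) {
    // New fields can later be added to the Builder without breaking callers.
    ResourceRequestSketch req = newBuilder().memory(2048).vCores(2).build();
    System.out.println(req.getMemory() + " " + req.getVCores() + " " + req.getNumContainers());
  }
}
```

The key property is that adding a field means adding one Builder method rather than a new constructor overload.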
[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15628123#comment-15628123 ] Tao Jie commented on YARN-5697: --- Hi [~Naganarasimha], I checked the test log and it seems that the test-case failures are due to the test environment: {quote} testNonExistentUser(org.apache.hadoop.yarn.client.TestGetGroups) Time elapsed: 0.004 sec <<< ERROR! java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "7ed7e992eec3":8033; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost {quote} I also ran the failed test cases in my local environment, and all of them pass. > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005-branch-2.8.patch, > YARN-5697.005.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli.
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.008.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch, > YARN-5552.008.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5697: -- Attachment: YARN-5697.005-branch-2.8.patch > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005-branch-2.8.patch, > YARN-5697.005.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.007.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: (was: YARN-4997-009.patch) > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-009.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-009.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.006.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch, YARN-5552.006.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624840#comment-15624840 ] Tao Jie commented on YARN-4997: --- Hi [~kasha], I rebased the patch for review. Regarding the semantics of {{setPermission}}, I checked the code in Apache Ranger, where {{RangerYarnAuthorizer}} extends {{YarnAuthorizationProvider}} and overrides the method {{setPermission}}. As a result, we should keep this method's behavior as it is. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler.
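The point about Ranger can be sketched as follows: an external plugin overrides {{setPermission}} to capture queue ACLs, so the scheduler must keep calling it. The interface and method shapes below are assumptions for illustration, not the actual YarnAuthorizationProvider API:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a pluggable authorizer, loosely modeled on the
// YarnAuthorizationProvider idea; names here are assumptions, not Hadoop code.
public class AuthProviderSketch {
  interface AuthorizationProvider {
    void setPermission(String queue, String commaSeparatedUsers);
    boolean checkPermission(String user, String queue);
  }

  // A plugin implementation (Ranger-style) that stores whatever ACLs the
  // scheduler pushes via setPermission. If the scheduler stopped calling
  // setPermission, the plugin would never see the queue ACLs.
  static class PluginAuthorizer implements AuthorizationProvider {
    private final Map<String, String> acls = new HashMap<>();
    public void setPermission(String queue, String users) { acls.put(queue, users); }
    public boolean checkPermission(String user, String queue) {
      String acl = acls.getOrDefault(queue, "");
      return acl.equals("*") || Arrays.asList(acl.split(",")).contains(user);
    }
  }

  public static void main(String[] args) {
    AuthorizationProvider auth = new PluginAuthorizer();
    auth.setPermission("root.queueA", "alice,bob");
    System.out.println(auth.checkPermission("alice", "root.queueA")); // true
    System.out.println(auth.checkPermission("carol", "root.queueA")); // false
  }
}
```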
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-009.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5697: -- Attachment: YARN-5697.005.patch > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624599#comment-15624599 ] Tao Jie commented on YARN-5720: --- Updated the document with respect to the discussion in YARN-5697, and changed the position of the {{-failOnUnknownNodes}} option. > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch, > YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, the document should be updated since the commands have changed.
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YarnCommands.png > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch, > YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: nodeLabel.png > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch, nodeLabel.png > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: (was: YarnCommands.png) > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YARN-5720.002.patch > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: (was: nodeLabel.png) > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YARN-5720.002.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5697: -- Attachment: YARN-5697.004.patch > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5697: -- Attachment: (was: YARN-5697.004.patch) > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624544#comment-15624544 ] Tao Jie commented on YARN-5697: --- Updated the patch as discussed above; removal of {{-directlyAccessNodeLabelStore}} is not included in this patch. [~Naganarasimha], would you mind reviewing this patch? > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli.
[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5697: -- Attachment: YARN-5697.004.patch > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch, YARN-5697.004.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575292#comment-15575292 ] Tao Jie commented on YARN-5552: --- Updated the patch according to [~leftnoteasy]'s suggestion. [~asuresh], [~kasha], [~leftnoteasy] would you mind reviewing the latest patch again? > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.005.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, > YARN-5552.005.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records
[ https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5552: -- Attachment: YARN-5552.004.patch > Add Builder methods for common yarn API records > --- > > Key: YARN-5552 > URL: https://issues.apache.org/jira/browse/YARN-5552 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Tao Jie > Attachments: YARN-5552.000.patch, YARN-5552.001.patch, > YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch > > > Currently yarn API records such as ResourceRequest, AllocateRequest/Respone > as well as AMRMClient.ContainerRequest have multiple constructors / > newInstance methods. This makes it very difficult to add new fields to these > records. > It would probably be better if we had Builder classes for many of these > records, which would make evolution of these records a bit easier. > (suggested by [~kasha]) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570652#comment-15570652 ] Tao Jie commented on YARN-5720: --- Attached pictures of the generated HTML, which are easier to review. > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, the document should be updated since the commands have changed.
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YarnCommands.png nodeLabel.png > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch, YarnCommands.png, nodeLabel.png > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YarnCommands.html NodeLabel.html YARN-5720.001.patch > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: (was: YARN-5720.001.patch) > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: (was: YarnCommands.html) > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: (was: NodeLabel.html) > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570580#comment-15570580 ] Tao Jie commented on YARN-5697: --- Thank you [~Naganarasimha]. I tried stricter parsing logic in an earlier patch, but it failed in the test case {{TestRMAdminCLI#directlyAccessNodeLabelStore}}: {code} // change the sequence of "-directlyAccessNodeLabelStore" and labels, // should not matter args = new String[] { "-addToClusterNodeLabels", "-directlyAccessNodeLabelStore", "x,y" }; assertEquals(0, rmAdminCLI.run(args)); assertTrue(dummyNodeLabelsManager.getClusterNodeLabelNames().containsAll( ImmutableSet.of("x", "y"))); {code} It seems that we currently do not care about the position of {{-directlyAccessNodeLabelStore}} on the command line. Although {{-directlyAccessNodeLabelStore}} is marked as deprecated, the option still leads to a different code path: {code} if (directlyAccessNodeLabelStore) { getNodeLabelManagerInstance(getConf()).replaceLabelsOnNode(map); } else { ResourceManagerAdministrationProtocol adminProtocol = createAdminProtocol(); ReplaceLabelsOnNodeRequest request = ReplaceLabelsOnNodeRequest.newInstance(map); request.setFailOnUnknownNodes(failOnUnknownNodes); adminProtocol.replaceLabelsOnNode(request); } {code} Should we just remove the logic for {{-directlyAccessNodeLabelStore}} in this patch? To make it clear: 1. We should restrict the command-line format ({{rmadmin -addToClusterNodeLabels -directlyAccessNodeLabelStore x,y}} will no longer be accepted; likewise {{rmadmin -replaceLabelsOnNode -failOnUnknownNodes node1=label1}} should become {{rmadmin -replaceLabelsOnNode node1=label1 -failOnUnknownNodes}}). 2. We should remove the code for {{-directlyAccessNodeLabelStore}} in this patch. 3. We should update the document and remove {{-directlyAccessNodeLabelStore}}. Agree? 
> Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Fix For: 2.8.0 > > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
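The position-independence that the test above locks in can be illustrated with a small sketch. This is a hypothetical illustration, not the actual RMAdminCLI code: it scans for the deprecated flag anywhere on the command line and keeps the remaining tokens as positional arguments, which is effectively the behavior the {{cliParser.getArgs()}} approach preserves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not the actual RMAdminCLI code): the deprecated
// -directlyAccessNodeLabelStore flag is accepted in any position, and the
// remaining tokens are kept as positional arguments in their original order.
public class FlagScan {
  static final String FLAG = "-directlyAccessNodeLabelStore";

  // True if the flag appears anywhere on the command line.
  static boolean hasFlag(String[] args) {
    return Arrays.asList(args).contains(FLAG);
  }

  // Positional arguments with the flag filtered out, wherever it appeared.
  static List<String> positional(String[] args) {
    List<String> out = new ArrayList<>();
    for (String a : args) {
      if (!FLAG.equals(a)) {
        out.add(a);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // The two orderings from the test case are equivalent under this scheme.
    String[] a = {"-addToClusterNodeLabels", FLAG, "x,y"};
    String[] b = {"-addToClusterNodeLabels", "x,y", FLAG};
    System.out.println(hasFlag(a) && hasFlag(b));            // true
    System.out.println(positional(a).equals(positional(b))); // true
  }
}
```

Under this scheme both argument orders produce the same result, which is exactly what the existing test asserts.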
[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565416#comment-15565416 ] Tao Jie commented on YARN-5697: --- Thank you [~Naganarasimha] for your comments. As I mentioned before, I tried {{cliParser.getOptionValues}} in the earlier patch but found an incompatibility. Both {{rmadmin -replaceLabelsOnNode node1=label1 -directlyAccessNodeLabelStore}} and {{rmadmin -replaceLabelsOnNode -directlyAccessNodeLabelStore node1=label1}} work under the existing logic. When I use {{cliParser.getOptionValues}} to parse the latter command, {{node1=label1}} is parsed as the option value of {{-directlyAccessNodeLabelStore}} rather than of {{-replaceLabelsOnNode}}. In fact, {{-directlyAccessNodeLabelStore}} is valid in any position. As a result I used {{cliParser.getArgs()}}, which ignores the argument order but stays compatible with the existing logic. > Use CliParser to parse options in RMAdminCLI > > > Key: YARN-5697 > URL: https://issues.apache.org/jira/browse/YARN-5697 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie > Fix For: 2.8.0 > > Attachments: YARN-5697.001.patch, YARN-5697.002.patch, > YARN-5697.003.patch > > > As discussed in YARN-4855, it is better to use CliParser rather than args to > parse command line options in RMAdminCli. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
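The mis-binding described above can be sketched without commons-cli. The following is a hypothetical model, not the commons-cli implementation: a left-to-right parser in which a value-taking option consumes the next token unless that token itself looks like an option. The class and method names ({{MisBind}}, {{parse}}) are illustrative only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative model of the mis-binding (not commons-cli itself): each
// option declared as value-taking consumes the following token as its value,
// unless that token starts with "-". With both options declared value-taking,
// "node1=label1" ends up bound to the deprecated flag.
public class MisBind {
  static Map<String, String> parse(String[] args, Set<String> valueTaking) {
    Map<String, String> bound = new HashMap<>();
    for (int i = 0; i < args.length; i++) {
      if (args[i].startsWith("-")) {
        String opt = args[i];
        if (valueTaking.contains(opt) && i + 1 < args.length
            && !args[i + 1].startsWith("-")) {
          bound.put(opt, args[++i]); // next token becomes the option value
        } else {
          bound.put(opt, null);      // option present, but no value captured
        }
      }
    }
    return bound;
  }

  public static void main(String[] args) {
    Set<String> valueTaking = new HashSet<>(
        Arrays.asList("-replaceLabelsOnNode", "-directlyAccessNodeLabelStore"));
    String[] cmd = {"-replaceLabelsOnNode", "-directlyAccessNodeLabelStore",
        "node1=label1"};
    Map<String, String> bound = parse(cmd, valueTaking);
    // node1=label1 is captured by the deprecated flag, not -replaceLabelsOnNode
    System.out.println(bound.get("-directlyAccessNodeLabelStore")); // node1=label1
    System.out.println(bound.get("-replaceLabelsOnNode"));          // null
  }
}
```

Passing the positional tokens through untouched (the {{cliParser.getArgs()}} approach) sidesteps this binding ambiguity entirely, at the cost of validating the values by hand.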
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Attachment: YARN-5720.001.patch > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
[ https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-5720: -- Affects Version/s: 2.8.0 > Update document for "rmadmin -replaceLabelOnNode" > - > > Key: YARN-5720 > URL: https://issues.apache.org/jira/browse/YARN-5720 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Minor > Attachments: YARN-5720.001.patch > > > As mentioned in YARN-4855, document should be updated since commands has > changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"
Tao Jie created YARN-5720: - Summary: Update document for "rmadmin -replaceLabelOnNode" Key: YARN-5720 URL: https://issues.apache.org/jira/browse/YARN-5720 Project: Hadoop YARN Issue Type: Improvement Reporter: Tao Jie Assignee: Tao Jie Priority: Minor As mentioned in YARN-4855, the document should be updated since the commands have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org