[jira] [Commented] (YARN-6481) Yarn top shows negative container number in FS

2017-05-01 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991952#comment-15991952
 ] 

Tao Jie commented on YARN-6481:
---

Thank you [~yufeigu]. I added a check in {{TestFairScheduler.testQueueInfo()}}
to ensure that the metrics here really work.
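For the record, the check is roughly of this shape (a sketch only; the queue
name and expected counts are made up, and the real assertions live in
{{TestFairScheduler.testQueueInfo()}}):
{code}
// Sketch: verify that the container counts surfaced through
// QueueInfo/QueueStatistics reflect the actual allocations.
QueueInfo queueInfo = scheduler.getQueueInfo("queueA", false, false);
QueueStatistics stats = queueInfo.getQueueStatistics();
assertEquals(1, stats.getAllocatedContainers());
assertEquals(0, stats.getPendingContainers());
assertEquals(0, stats.getReservedContainers());
{code}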

> Yarn top shows negative container number in FS
> --
>
> Key: YARN-6481
> URL: https://issues.apache.org/jira/browse/YARN-6481
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.9.0
>Reporter: Yufei Gu
>Assignee: Tao Jie
>  Labels: newbie
> Attachments: YARN-6481.001.patch, YARN-6481.002.patch
>
>
> yarn top shows negative container numbers, and they didn't change even when
> they were supposed to.
> {code}
> NodeManager(s): 2 total, 2 active, 0 unhealthy, 0 decommissioned, 0 lost, 0 
> rebooted
> Queue(s) Applications: 0 running, 12 submitted, 0 pending, 12 completed, 0 
> killed, 0 failed
> Queue(s) Mem(GB): 0 available, 0 allocated, 0 pending, 0 reserved
> Queue(s) VCores: 0 available, 0 allocated, 0 pending, 0 reserved
> Queue(s) Containers: -2 allocated, -2 pending, -2 reserved
>   APPLICATIONID USER TYPE  QUEUE   #CONT  
> #RCONT  VCORES RVC
> {code}






[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS

2017-05-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6481:
--
Attachment: (was: YARN-6481.002.patch)







[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS

2017-05-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6481:
--
Attachment: YARN-6481.002.patch







[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS

2017-05-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6481:
--
Attachment: YARN-6481.002.patch







[jira] [Updated] (YARN-6481) Yarn top shows negative container number in FS

2017-04-28 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6481:
--
Attachment: YARN-6481.001.patch







[jira] [Commented] (YARN-6481) Yarn top shows negative container number in FS

2017-04-28 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988861#comment-15988861
 ] 

Tao Jie commented on YARN-6481:
---

When generating the QueueStatistics instance in FSQueue, the metrics about
containers are missing.
[~yufeigu] [~kasha], I uploaded a patch; would you give it a review?
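The direction of the fix, as a rough sketch (the getter/setter names below are
my assumption of the relevant {{QueueMetrics}}/{{QueueStatistics}} methods):
{code}
// Sketch: when FSQueue builds its QueueStatistics, also copy over the
// container counters that were previously left unset.
stats.setAllocatedContainers(getMetrics().getAllocatedContainers());
stats.setPendingContainers(getMetrics().getPendingContainers());
stats.setReservedContainers(getMetrics().getReservedContainers());
{code}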







[jira] [Commented] (YARN-6380) FSAppAttempt keeps redundant copy of the queue

2017-03-23 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939735#comment-15939735
 ] 

Tao Jie commented on YARN-6380:
---

[~templedf], in the current FSAppAttempt constructor:
{code}
 public FSAppAttempt(FairScheduler scheduler,
  ApplicationAttemptId applicationAttemptId, String user, FSLeafQueue queue,
  ActiveUsersManager activeUsersManager, RMContext rmContext) {
super(applicationAttemptId, user, queue, activeUsersManager, rmContext);

this.scheduler = scheduler;
this.fsQueue = queue;
this.startTime = scheduler.getClock().getTime();
this.lastTimeAtFairShare = this.startTime;
this.appPriority = Priority.newInstance(1);
this.resourceWeights = new ResourceWeights();
  }
{code}
It seems to me that we create another reference to the queue rather than a
copy of the queue, so both {{SchedulerApplicationAttempt#queue}} and
{{FSAppAttempt#fsQueue}} point to the same object. It is OK to remove the
redundant field from FSAppAttempt, but we would then have to cast the Queue to
FSQueue when using it.
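For illustration, removing the field could look roughly like this (a sketch;
whether the getter should return {{FSQueue}} or {{FSLeafQueue}} depends on the
call sites):
{code}
// Sketch: derive the typed queue from the single reference kept in
// SchedulerApplicationAttempt instead of storing a second field.
@Override
public FSLeafQueue getQueue() {
  return (FSLeafQueue) super.getQueue();
}
{code}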

> FSAppAttempt keeps redundant copy of the queue
> --
>
> Key: YARN-6380
> URL: https://issues.apache.org/jira/browse/YARN-6380
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: YARN-6380.001.patch
>
>
> The {{FSAppAttempt}} class defines its own {{fsQueue}} variable that is a 
> second copy of the {{SchedulerApplicationAttempt}}'s {{queue}} variable.  
> Aside from being redundant, it's also a bug, because when moving 
> applications, we only update the {{SchedulerApplicationAttempt}}'s {{queue}}, 
> not the {{FSAppAttempt}}'s {{fsQueue}}.






[jira] [Commented] (YARN-6320) FairScheduler:Identifying apps to assign in updateThread

2017-03-13 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923469#comment-15923469
 ] 

Tao Jie commented on YARN-6320:
---

[~kasha], thank you for sharing your comment.
I would try it like this: in the update thread, we compute the resource
deficit for every app, taking the policies into account, and we keep the
maintained list of apps sorted by the amount of resource deficit. In each
nodeUpdate we then generally try to assign containers to the apps with the
largest resource deficit, as in the sketch below.
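A minimal sketch of that ordering (the {{getResourceDeficit()}} accessor is
hypothetical):
{code}
// Sketch: keep the candidate apps ordered by resource deficit,
// largest deficit first, so nodeUpdate serves the neediest apps.
candidateApps.sort((app1, app2) ->
    Long.compare(app2.getResourceDeficit(), app1.getResourceDeficit()));
{code}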







[jira] [Created] (YARN-6320) FairScheduler:Identifying apps to assign in updateThread

2017-03-10 Thread Tao Jie (JIRA)
Tao Jie created YARN-6320:
-

 Summary: FairScheduler:Identifying apps to assign in updateThread
 Key: YARN-6320
 URL: https://issues.apache.org/jira/browse/YARN-6320
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tao Jie


In FairScheduler today we have 1) the UpdateThread, which updates queue/app
status, fair share, and starvation info, and 2) nodeUpdate, triggered by NM
heartbeats, which does the actual scheduling. When we handle one nodeUpdate,
we walk top-down from the root queue to the leaf queues and find the neediest
application to allocate a container to, according to the queues' fair shares.
We also have to sort the children at each level of the hierarchy.
My thought is to keep a globally sorted {{candidateAppList}} of the apps that
need assignment, and to move the "find the app to allocate resources to" logic
from nodeUpdate to the UpdateThread. In the UpdateThread we find the candidate
apps and put them into {{candidateAppList}}; in nodeUpdate we consume the list
and allocate containers to apps.
As far as I can see, this brings three benefits:
1. nodeUpdate() is invoked much more frequently than update() in the
UpdateThread, especially in a large cluster, so we can avoid a lot of
unnecessary sorting.
2. It coordinates better with YARN-5829: we can indicate the apps to assign
directly, rather than letting nodes find the best apps to assign.
3. It should make it easier to introduce scheduling constraints such as node
labels and affinity/anti-affinity into FS, since we can pre-allocate
containers asynchronously.
[~kasha], [~templedf], [~yufeigu], I would like to hear your thoughts. A rough
sketch of the structure follows.
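A rough sketch of the proposed producer/consumer structure (all names below
are hypothetical):
{code}
// UpdateThread (producer): recompute deficits and publish a freshly
// sorted snapshot of the apps that need containers.
AtomicReference<List<FSAppAttempt>> candidateAppList = new AtomicReference<>();
candidateAppList.set(computeSortedCandidateApps());  // inside update()

// nodeUpdate (consumer): walk the pre-sorted snapshot instead of
// re-sorting the queue hierarchy on every heartbeat.
for (FSAppAttempt app : candidateAppList.get()) {
  if (tryAssignContainer(app, node)) {  // hypothetical helper
    break;
  }
}
{code}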






[jira] [Commented] (YARN-5829) FS preemption should reserve a node before considering containers on it for preemption

2017-03-09 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904584#comment-15904584
 ] 

Tao Jie commented on YARN-5829:
---

Thank you [~miklos.szeg...@cloudera.com] for sharing your thoughts.
1. It is easy to confuse the reservation we are talking about with the current
reservation mechanism in the scheduler. IIRC, the purpose of the current
reservation is to prevent starvation of requests for large resources, whereas
the reservation here is to assign a container on a node to one exact
application.
2. I am fine with either 1) reusing/extending the current reservation
mechanism or 2) adding separate logic to handle reservations for preemption.
If it is 2), we had better find another name to avoid the naming confusion.
3.
{quote}
2. We also need to be careful with prioritizing reservations. For example how 
it works now is that a reservation takes priority before any other request.
What happens, if I have a preemption from a lower priority request but there is 
a demand from a higher priority application?
{quote}
In my opinion, the reservation for preemption should have higher priority than
the current reservation in allocation. If the starved application that
triggered the preemption is not satisfied as soon as possible, it will remain
starved and try to preempt more containers. A normal application holding a
reserved container on a node would only wait for a while because the resource
is allocated to another starved application, and it makes sense that this
application gets the higher priority once it becomes a starved application
itself.

> FS preemption should reserve a node before considering containers on it for 
> preemption
> --
>
> Key: YARN-5829
> URL: https://issues.apache.org/jira/browse/YARN-5829
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>
> FS preemption evaluates nodes for preemption, and subsequently preempts 
> identified containers. If this node is not reserved for a specific 
> application, any other application could be allocated resources on this node. 
> Reserving the node for the starved application before preempting containers 
> would help avoid this.






[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-09 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904277#comment-15904277
 ] 

Tao Jie commented on YARN-6301:
---

Thank you [~templedf]. It is almost clear to me now. One thing I'd like to
confirm: the zero-weight "ad hoc queue" behavior only applies among its
sibling queues. If a queue under another parent queue has demand for
resources, the "ad hoc queue" can still receive resources through the fair
share of its own parent queue.
If I am wrong, please correct me.
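For context, the situation I have in mind corresponds to an allocation file
like this (an assumed example, not taken from the docs):
{code}
<allocations>
  <queue name="parentA">
    <queue name="prod">
      <weight>2.0</weight>
    </queue>
    <queue name="adhoc">
      <!-- weight 0: yields to its sibling "prod", but parentA as a whole
           still gets its fair share relative to parentB -->
      <weight>0.0</weight>
    </queue>
  </queue>
  <queue name="parentB">
    <weight>1.0</weight>
  </queue>
</allocations>
{code}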

> Fair scheduler docs should explain the meaning of setting a queue's weight to 
> zero
> --
>
> Key: YARN-6301
> URL: https://issues.apache.org/jira/browse/YARN-6301
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Tao Jie
>  Labels: docs
> Attachments: YARN-6301.001.patch, YARN-6301.002.patch
>
>







[jira] [Updated] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-09 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6301:
--
Attachment: YARN-6301.002.patch








[jira] [Commented] (YARN-6246) Identifying starved apps does not need the scheduler writelock

2017-03-09 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904237#comment-15904237
 ] 

Tao Jie commented on YARN-6246:
---

Thank you [~kasha] for working on this.
It seems to me that the starvation check only operates on leaf queues. Could
we simply iterate over the leaf queues via {{queueMgr.getLeafQueues()}} rather
than taking a top-down approach? I hope that would be more efficient.
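Something along these lines (a sketch; {{checkStarvation}} stands in for the
existing starvation-check logic):
{code}
// Sketch: only leaf queues hold applications, so check them directly
// instead of recursing down from the root queue.
for (FSLeafQueue queue : queueMgr.getLeafQueues()) {
  checkStarvation(queue);  // hypothetical stand-in
}
{code}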

> Identifying starved apps does not need the scheduler writelock
> --
>
> Key: YARN-6246
> URL: https://issues.apache.org/jira/browse/YARN-6246
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: YARN-6246.001.patch
>
>
> Currently, the starvation checks are done holding the scheduler writelock. We
> are probably better off doing this outside.






[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-09 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902659#comment-15902659
 ] 

Tao Jie commented on YARN-6301:
---

Attached a patch and improved the docs in FairScheduler.md








[jira] [Updated] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-09 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6301:
--
Attachment: YARN-6301.001.patch








[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-08 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902618#comment-15902618
 ] 

Tao Jie commented on YARN-6301:
---

[~templedf], today a queue's weight is allowed to be zero or even negative. It
seems to me that such a queue could not get any share beyond its minResources;
am I correct? Should we add a non-negative check here, since a negative queue
weight is even more confusing?
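The check I have in mind would sit in the allocation-file loading path,
roughly like this (a sketch; the exact location, variable names, and exception
type are assumptions):
{code}
// Sketch: reject negative weights while parsing the allocation file.
if (queueWeight < 0) {
  throw new AllocationConfigurationException(
      "Queue " + queueName + " has a negative weight: " + queueWeight);
}
{code}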








[jira] [Commented] (YARN-6307) Refactor FairShareComparator#compare

2017-03-08 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902482#comment-15902482
 ] 

Tao Jie commented on YARN-6307:
---

Thank you [~yufeigu]. FairShareComparator#compare is called very frequently
during container allocation, so simplifying this method would improve
scheduler performance.
Furthermore, I don't think it is necessary to sort the queue hierarchy from
the root down to the leaf queues on every node update. Could we do the sort in
the update thread and then share the result with node updates? That would
remove a lot of redundant sorting. Maybe we can improve this in another JIRA.
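For example, the comparator could be split into small ordered stages, roughly
like this (the helper names are hypothetical):
{code}
// Sketch: compare in well-named stages instead of one intertwined method.
public int compare(Schedulable s1, Schedulable s2) {
  int res = compareMinShareUsage(s1, s2);
  if (res == 0) {
    res = compareFairShareUsage(s1, s2);
  }
  if (res == 0) {
    res = compareStartTimeThenName(s1, s2);  // tie-breaker
  }
  return res;
}
{code}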

> Refactor FairShareComparator#compare
> 
>
> Key: YARN-6307
> URL: https://issues.apache.org/jira/browse/YARN-6307
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> The method does three things: check the min share ratio, check the weight
> ratio, and break ties by submit time and name. These are mixed together, which
> makes the method hard to read and maintain. Additionally, there are potential
> performance issues; for example, there is no need to calculate the weight
> ratio every time.






[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2017-03-07 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900575#comment-15900575
 ] 

Tao Jie commented on YARN-5881:
---

Thank you [~leftnoteasy]. It seems that with this feature the queue-resource
configuration would become similar to FairScheduler's. Would it be possible to
choose either FS or CS for scheduling while keeping the same configuration
file?

> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Wangda Tan
> Attachments: 
> YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf
>
>
> Currently, Yarn RM supports the configuration of queue capacity as a
> proportion of cluster capacity. In the context of Yarn being used as a public
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.






[jira] [Commented] (YARN-6042) Dump scheduler and queue state information into FairScheduler DEBUG log

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893880#comment-15893880
 ] 

Tao Jie commented on YARN-6042:
---

Hi [~yufeigu], dumping the scheduler/queue state is very useful for diagnosing
scheduling problems at run time. It seems that you write the scheduler/queue
information to the log file. How about also printing this information on the
web UI, just as we can get the server stacks via a link?

> Dump scheduler and queue state information into FairScheduler DEBUG log
> ---
>
> Key: YARN-6042
> URL: https://issues.apache.org/jira/browse/YARN-6042
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6042.001.patch, YARN-6042.002.patch, 
> YARN-6042.003.patch, YARN-6042.004.patch, YARN-6042.005.patch, 
> YARN-6042.006.patch, YARN-6042.007.patch, YARN-6042.008.patch
>
>
> To improve the debugging of scheduler issues, it would be a big improvement to
> be able to dump the scheduler state into a log on request.
> A dump of the scheduler state at a point in time would allow debugging of a
> scheduler that is not hung (deadlocked) but is also not assigning containers.
> Currently we do not have a proper overview of what state the scheduler and
> the queues are in, and we have to make assumptions or guess.
> The scheduler and queue state needed would include (not exhaustive):
> - instantaneous and steady fair share (app / queue)
> - AM share and resources
> - weight
> - app demand
> - application run state (runnable/non runnable)
> - last time at fair/min share






[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893722#comment-15893722
 ] 

Tao Jie commented on YARN-6249:
---

Updated the patch per [~yufeigu]'s comments, and ran the test 200 times again
without failure.

> TestFairSchedulerPreemption is inconsistently failing on trunk
> --
>
> Key: YARN-6249
> URL: https://issues.apache.org/jira/browse/YARN-6249
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Sean Po
>Assignee: Tao Jie
> Attachments: YARN-6249.001.patch, YARN-6249.002.patch
>
>
> Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. 
> An example stack trace: 
> {noformat}
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption
> testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption)
>   Time elapsed: 10.475 sec  <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363)
> {noformat}






[jira] [Updated] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6249:
--
Attachment: YARN-6249.002.patch







[jira] [Assigned] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-6249:
-

Assignee: Tao Jie  (was: Yufei Gu)







[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893644#comment-15893644
 ] 

Tao Jie commented on YARN-6249:
---

Thank you [~yufeigu] [~miklos.szeg...@cloudera.com] for your reply!
{quote}
 Would it make sense to initialize control clock before set it to scheduler 
like this?
{quote}
Agree! It makes this test closer to the real world.







[jira] [Comment Edited] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892327#comment-15892327
 ] 

Tao Jie edited comment on YARN-6249 at 3/2/17 2:32 PM:
---

In the attached patch, I call update() explicitly between the allocation for
app1 and the submission of app2, to ensure that the {{minShareStarvation}} of
root.preemptable.child-2 is refreshed.
[~yufeigu] [~kasha], would you take a look at it? I ran this case 300 times
with no failure, whereas 3 out of 100 runs failed without this patch.


was (Author: tao jie):
In the attached patch, I call update() explicitly after app1 is allocated and
before app2 is submitted, to ensure that the {{minShareStarvation}} of
root.preemptable.child-2 is refreshed.
[~yufeigu] [~kasha], would you take a look at it? I ran this case 300 times
with no failure, whereas 3 out of 100 runs failed without this patch.







[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892327#comment-15892327
 ] 

Tao Jie commented on YARN-6249:
---

In the attached patch, I call update() explicitly after app1 is allocated and
before app2 is submitted, to ensure that the {{minShareStarvation}} of
root.preemptable.child-2 is refreshed.
[~yufeigu] [~kasha], would you take a look at it? I ran this case 300 times
with no failure, whereas 3 out of 100 runs failed without this patch.







[jira] [Updated] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6249:
--
Attachment: YARN-6249.001.patch







[jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk

2017-03-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892305#comment-15892305
 ] 

Tao Jie commented on YARN-6249:
---

I debugged this test and found the root cause of the failure.
In the test, the FSLeafQueues are initialized before
{{scheduler.setClock(clock)}} is called in setup(). As a result,
{{lastTimeAtMinShare}} in FSLeafQueue is initialized to the current wall-clock
time (a big number), while it is later compared against the time of the
{{ControlledClock}}, which starts from 0.
In {{FSLeafQueue#minShareStarvation}}, invoked from update():
{code}
long now = scheduler.getClock().getTime();
if (!starved) {
  // Record that the queue is not starved
  setLastTimeAtMinShare(now);
}

if (now - lastTimeAtMinShare < getMinSharePreemptionTimeout()) {
  // the queue is not starved for the preemption timeout
  starvation = Resources.clone(Resources.none());
}
{code}
If {{starved}} is true here the first time this method is called, the queue
will never satisfy the min share preemption timeout.
However, I don't think this is a bug in the real world, because the issue only
involves the ControlledClock, which is used in tests.
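A sketch of a fix on the test side (assuming {{ControlledClock}} exposes a
{{setTime()}} setter):
{code}
// Sketch: start the test clock at the current wall-clock time, and install
// it before the FSLeafQueues are created, so that lastTimeAtMinShare and
// the clock share the same time base.
ControlledClock clock = new ControlledClock();
clock.setTime(System.currentTimeMillis());
scheduler.setClock(clock);
{code}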








[jira] [Commented] (YARN-6236) Move lock() out of try-block in FairScheduler

2017-02-26 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885267#comment-15885267
 ] 

Tao Jie commented on YARN-6236:
---

Checked the files in the fair folder with {{grep -R "Lock.lock()" -A 1 -B 1 ./}}
and uploaded the patch.
[~kasha], would you give it a review?







[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6236:
--
Component/s: fairscheduler







[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6236:
--
Priority: Minor  (was: Major)







[jira] [Updated] (YARN-6236) Move lock() out of try-block in FairScheduler

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6236:
--
Attachment: YARN-6236.001.patch







[jira] [Created] (YARN-6236) Move lock() out of try-block in FairScheduler

2017-02-26 Thread Tao Jie (JIRA)
Tao Jie created YARN-6236:
-

 Summary: Move lock() out of try-block in FairScheduler
 Key: YARN-6236
 URL: https://issues.apache.org/jira/browse/YARN-6236
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tao Jie
Assignee: Tao Jie


As discussed in YARN-6215, calling {{read/writeLock.lock()}} inside the
try-block is widespread in the existing code, especially in FairScheduler.java,
e.g.:
{code}
  public ResourceWeights getAppWeight(FSAppAttempt app) {
    try {
      readLock.lock();
      ...
      ...
      return resourceWeights;
    } finally {
      readLock.unlock();
    }
  }
{code}
However, best practice is to call {{lock()}} outside of the try-block: if an
exception happens in {{lock()}} itself, the {{unlock()}} in the finally-block
should not be invoked.
We had better move {{lock()}} out of the try-block, as in the sketch below.
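The corrected pattern, as a sketch of the same method:
{code}
  public ResourceWeights getAppWeight(FSAppAttempt app) {
    // Acquire the lock outside the try-block: if lock() itself throws,
    // the finally-block (and thus unlock()) is never reached.
    readLock.lock();
    try {
      ...
      ...
      return resourceWeights;
    } finally {
      readLock.unlock();
    }
  }
{code}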






[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-26 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885080#comment-15885080
 ] 

Tao Jie commented on YARN-6215:
---

[~kasha], thank you for your comments; patch updated.
{quote}
lock() should be called outside the try-block
{quote}
I hadn't thought much about this before; {{lock()}} both inside and outside
the try-block exists in the current code. I checked some discussions on Stack
Overflow: {{lock()}} itself does not throw checked exceptions, but in case it
throws an unchecked exception (which would hardly ever happen), {{unlock()}}
should not be invoked. So calling {{lock()}} outside the try-block is the
better practice.
Is it necessary to move the existing {{lock()}} calls outside the try-blocks?
At least in {{FairScheduler.java}}, most {{lock()}} calls are inside
try-blocks now.

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch, YARN-6215.002.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}






[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6215:
--
Attachment: (was: YARN-6215.002.patch)

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch, YARN-6215.002.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6215:
--
Attachment: YARN-6215.002.patch

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch, YARN-6215.002.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-26 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6215:
--
Attachment: YARN-6215.002.patch

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch, YARN-6215.002.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-23 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881721#comment-15881721
 ] 

Tao Jie commented on YARN-6215:
---

[~kasha] thank you for your comment!
I understand your concern. It may bring some performance loss if we add a read 
lock, but it is risky if we don't. Preempting while the fair shares of queues are 
only partially updated would not only miss containers that should be preempted, 
but might also preempt containers that should not be preempted. It would be 
unpredictable.
In earlier FS code, updating fair shares and preempting containers took place in 
one thread, so I think a read lock here would not make the performance worse.

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-23 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880886#comment-15880886
 ] 

Tao Jie commented on YARN-6215:
---

[~kasha] [~yufeigu] [~sunilg], I uploaded a patch that adds a read lock in 
FSPreemptionThread. Would you mind giving it a review?

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-23 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6215:
--
Attachment: YARN-6215.001.patch

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
> Attachments: YARN-6215.001.patch
>
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-23 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880872#comment-15880872
 ] 

Tao Jie commented on YARN-6215:
---

I debugged this unit test and found that the failure is caused by a race between 
the updateThread and the preemptionThread.
The updateThread goes through all queues and triggers the preemptionThread as soon 
as it finds that an app is starved. At that moment some queues have been updated 
while others have not, and the preemptionThread then tries to find containers to 
preempt based on this incomplete state.
Today the updateThread runs under the writeLock of FairScheduler; as a result, we 
need to take a readLock of FS in the preemptionThread at the same time.
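
A minimal sketch of the intended coordination (the names are assumed here, not 
taken from the patch): updates happen atomically under the write lock, and the 
preemption scan takes the read lock so it can never observe a half-updated queue 
hierarchy.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SchedulerLockingSketch {
  private final ReentrantReadWriteLock schedulerLock =
      new ReentrantReadWriteLock();

  // updateThread: recomputes fair shares for all queues as one atomic step.
  void update() {
    schedulerLock.writeLock().lock();
    try {
      // ... recompute the fair share of every queue ...
    } finally {
      schedulerLock.writeLock().unlock();
    }
  }

  // preemptionThread: scans under the read lock of the same lock, so it
  // sees the queue state either before an update or after it, never mid-way.
  void identifyContainersToPreempt() {
    schedulerLock.readLock().lock();
    try {
      // ... walk the queues and pick containers to preempt ...
    } finally {
      schedulerLock.readLock().unlock();
    }
  }
}
{code}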


> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6215) TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in trunk

2017-02-23 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-6215:
-

Assignee: Tao Jie

> TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues fails in 
> trunk
> 
>
> Key: YARN-6215
> URL: https://issues.apache.org/jira/browse/YARN-6215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, test
>Reporter: Sunil G
>Assignee: Tao Jie
>
> *Error Message*
> Incorrect number of containers on the greedy app expected:<4> but was:<8>
> Failed test case 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/15038/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestFairSchedulerPreemption/testPreemptionBetweenNonSiblingQueues_FairSharePreemptionWithDRF_/]
> *Stacktrace*
> {noformat}
> java.lang.AssertionError: Incorrect number of containers on the greedy app 
> expected:<4> but was:<8>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionBetweenNonSiblingQueues(TestFairSchedulerPreemption.java:323)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6225) Global scheduler applies to Fair scheduler

2017-02-22 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6225:
--
Summary: Global scheduler applies to Fair scheduler  (was: Global scheduler 
apply to Fair scheduler)

> Global scheduler applies to Fair scheduler
> --
>
> Key: YARN-6225
> URL: https://issues.apache.org/jira/browse/YARN-6225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Jie
>
> IIRC, in global scheduling, logic for scheduling constraints such as node labels and 
> affinity/anti-affinity takes place before the scheduler tries to commit the 
> ResourceCommitRequest. It looks like this logic can be shared by FairScheduler and 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6225) Global scheduler apply to Fair scheduler

2017-02-22 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-6225:
-

Assignee: Tao Jie

> Global scheduler apply to Fair scheduler
> 
>
> Key: YARN-6225
> URL: https://issues.apache.org/jira/browse/YARN-6225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Jie
>Assignee: Tao Jie
>
> IIRC, in global scheduling, logic for scheduling constraints such as node labels and 
> affinity/anti-affinity takes place before the scheduler tries to commit the 
> ResourceCommitRequest. It looks like this logic can be shared by FairScheduler and 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6225) Global scheduler apply to Fair scheduler

2017-02-22 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-6225:
-

Assignee: (was: Tao Jie)

> Global scheduler apply to Fair scheduler
> 
>
> Key: YARN-6225
> URL: https://issues.apache.org/jira/browse/YARN-6225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Jie
>
> IIRC, in global scheduling, logic for scheduling constraints such as node labels and 
> affinity/anti-affinity takes place before the scheduler tries to commit the 
> ResourceCommitRequest. It looks like this logic can be shared by FairScheduler and 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6225) Global scheduler apply to Fair scheduler

2017-02-22 Thread Tao Jie (JIRA)
Tao Jie created YARN-6225:
-

 Summary: Global scheduler apply to Fair scheduler
 Key: YARN-6225
 URL: https://issues.apache.org/jira/browse/YARN-6225
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Tao Jie


IIRC, in global scheduling, logic for scheduling constraints such as node labels and 
affinity/anti-affinity takes place before the scheduler tries to commit the 
ResourceCommitRequest. It looks like this logic can be shared by FairScheduler and 
CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6224) Should consider utilization of each ResourceType on node while scheduling

2017-02-22 Thread Tao Jie (JIRA)
Tao Jie created YARN-6224:
-

 Summary: Should consider utilization of each ResourceType on node 
while scheduling
 Key: YARN-6224
 URL: https://issues.apache.org/jira/browse/YARN-6224
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Tao Jie


In situations like YARN-6101, if we consider the utilization of every resource type 
(vcores, memory) on a node rather than just answering whether we can allocate or 
not, we are more likely to achieve better resource utilization as a whole.
With global scheduling, it is possible to take a set of candidate nodes and then 
find the most promising node to assign to one request, considering node resource 
utilization.
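
One possible shape for such a scorer, purely illustrative (the class name, the 
scoring formula, and the numbers below are all assumptions, not part of this 
proposal): among candidate nodes, prefer the one whose per-type utilizations stay 
most balanced after placing the request.
{code}
public class BalancedUtilizationScorer {
  /** Lower is better: the spread between CPU and memory utilization
      after a hypothetical allocation. */
  static double score(long usedVcores, long totalVcores,
                      long usedMemMB, long totalMemMB,
                      long reqVcores, long reqMemMB) {
    double cpuUtil = (double) (usedVcores + reqVcores) / totalVcores;
    double memUtil = (double) (usedMemMB + reqMemMB) / totalMemMB;
    return Math.abs(cpuUtil - memUtil);
  }

  public static void main(String[] args) {
    // Node A: 20/48 cores, 100/192 GB used; Node B: 40/48 cores, 60/192 GB.
    double a = score(20, 48, 100 * 1024, 192 * 1024, 1, 2 * 1024);
    double b = score(40, 48, 60 * 1024, 192 * 1024, 1, 2 * 1024);
    System.out.println(a < b ? "prefer node A" : "prefer node B");
  }
}
{code}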



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6101) Delay scheduling for node resource balance

2017-02-22 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877833#comment-15877833
 ] 

Tao Jie commented on YARN-6101:
---

[~He Tianyi], thank you for sharing your case.
Today scheduling is triggered by NM heartbeat; that is, when an NM heartbeats in, 
the scheduler selects containers to assign to that NM. It is difficult to find the 
globally best node to run a container for an application. It seems that YARN-5139 
improves the scheduling logic: first we find a set of candidate nodes for each 
resource request, then a NodeScorer measures which node is the best one to 
allocate on. In that case, the node's utilization should be considered.


> Delay scheduling for node resource balance
> --
>
> Key: YARN-6101
> URL: https://issues.apache.org/jira/browse/YARN-6101
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: He Tianyi
>Priority: Minor
> Attachments: YARN-6101.preliminary..patch
>
>
> We observed that, in today's cluster, usage of Spark has dramatically 
> increased. 
> This introduced a new issue: CPU/memory utilization on a single node may 
> become imbalanced because Spark is generally more memory intensive. For 
> example, a node with capability (48 cores, 192 GB memory) cannot satisfy a 
> (1 core, 2 GB memory) request if its currently used resource is (20 cores, 
> 191 GB memory), even with plenty of total available resource across the 
> whole cluster.
> One thought for avoiding this situation is to introduce some strategy during 
> scheduling.
> This JIRA proposes a delay-scheduling-alike approach to achieve better 
> balance between different types of resources on each node.
> The basic idea is to consider the dominant resource of each node, and when a 
> scheduling opportunity on a particular node is offered to a resource request, 
> make sure the allocation would change the dominant resource of the node; or, 
> in the worst case, allocate at once when the number of offered scheduling 
> opportunities exceeds a certain threshold.
> With YARN SLS and a simulation file with a hybrid workload (MR+Spark), the 
> approach improved cluster resource usage by nearly 5%. And after deploying to 
> production, we observed an 8% improvement.
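
An illustrative sketch of the delay-scheduling-alike idea described above, under 
one possible reading of "the allocation changes the dominant resource of the 
node" (the names and the threshold are assumptions; this is not the attached 
patch):
{code}
public class NodeBalanceDelayPolicy {
  // Assumed threshold: after this many passed-over opportunities, allocate.
  private static final int MAX_MISSED_OPPORTUNITIES = 10;

  enum Dominant { CPU, MEMORY }

  static Dominant dominantResource(double cpuUtil, double memUtil) {
    return cpuUtil >= memUtil ? Dominant.CPU : Dominant.MEMORY;
  }

  static boolean shouldAllocate(double cpuUtilBefore, double memUtilBefore,
                                double cpuUtilAfter, double memUtilAfter,
                                int missedOpportunities) {
    // Worst case: stop waiting and allocate at once.
    if (missedOpportunities >= MAX_MISSED_OPPORTUNITIES) {
      return true;
    }
    // Otherwise allocate only if the node's dominant resource would flip,
    // i.e. the allocation pushes the node toward CPU/memory balance.
    return dominantResource(cpuUtilBefore, memUtilBefore)
        != dominantResource(cpuUtilAfter, memUtilAfter);
  }
}
{code}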



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5829) FS preemption should reserve a node before considering containers on it for preemption

2017-02-16 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871054#comment-15871054
 ] 

Tao Jie commented on YARN-5829:
---

[~kasha], this seems to be a similar situation to the one mentioned in YARN-5636.
Should we have a common mechanism that supports "reserving certain resources on a 
certain node for a certain app for a while"?
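
Purely as an illustration of what such a mechanism might carry (every name here is 
hypothetical), the reservation essentially needs a node, an app, and an expiry:
{code}
public class TimedNodeReservation {
  private final String nodeId;
  private final String appAttemptId;
  private final long expiryMillis;

  TimedNodeReservation(String nodeId, String appAttemptId, long ttlMillis) {
    this.nodeId = nodeId;
    this.appAttemptId = appAttemptId;
    this.expiryMillis = System.currentTimeMillis() + ttlMillis;
  }

  /** While unexpired, the node may only satisfy the reserving app. */
  boolean isHeldFor(String node, String appAttempt) {
    return System.currentTimeMillis() < expiryMillis
        && nodeId.equals(node)
        && appAttemptId.equals(appAttempt);
  }
}
{code}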

> FS preemption should reserve a node before considering containers on it for 
> preemption
> --
>
> Key: YARN-5829
> URL: https://issues.apache.org/jira/browse/YARN-5829
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>
> FS preemption evaluates nodes for preemption, and subsequently preempts 
> identified containers. If this node is not reserved for a specific 
> application, any other application could be allocated resources on this node. 
> Reserving the node for the starved application before preempting containers 
> would help avoid this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2017-02-16 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869650#comment-15869650
 ] 

Tao Jie commented on YARN-2497:
---

We already have an implementation of node label support for FS, but it is based on 
an earlier hadoop version. I would like to rebase the patch, since the preemption 
logic for fairscheduler has been refactored in YARN-4752.
[~kasha], would you mind if I take this JIRA over?


> Changes for fair scheduler to support allocate resource respect labels
> --
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-14 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749979#comment-15749979
 ] 

Tao Jie commented on YARN-6000:
---

[~templedf], [~kasha] would you mind taking a look at it?

> Set modifier of interface Listener in AllocationFileLoaderService to public
> ---
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-6000.001.patch
>
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
> YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
> {{FairSchedulerShim}}. 
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> {code}
> As a result we should set the modifier back to public.
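
The fix is presumably as small as restoring the modifier on the nested interface 
(a sketch; the {{onReload}} signature is taken from the Hive snippet above):
{code}
// Inside AllocationFileLoaderService: public again, so external
// implementors such as Hive's FairSchedulerShim can compile against it.
public interface Listener {
  void onReload(AllocationConfiguration allocs);
}
{code}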



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-13 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6000:
--
Attachment: YARN-6000.001.patch

> Set modifier of interface Listener in AllocationFileLoaderService to public
> ---
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-6000.001.patch
>
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
> YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
> {{FairSchedulerShim}}. 
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> {code}
> As a result we should set the modifier back to public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746988#comment-15746988
 ] 

Tao Jie commented on YARN-4997:
---

Thank you [~sershe], I have created another JIRA, YARN-6000, to handle this.
It's OK if you change the Hive code to work around this and make the logic clearer. 
However, since our change breaks existing code, we should get it fixed. Otherwise, 
when we update the Hadoop version but not Hive (which may not have been released 
yet), it would fail.


> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-13 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-6000:
-

Assignee: Tao Jie

> Set modifier of interface Listener in AllocationFileLoaderService to public
> ---
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>Assignee: Tao Jie
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
> YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
> {{FairSchedulerShim}}. 
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> {code}
> As a result we should set the modifier back to public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-13 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6000:
--
Component/s: yarn
 fairscheduler

> Set modifier of interface Listener in AllocationFileLoaderService to public
> ---
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
> YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
> {{FairSchedulerShim}}. 
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> {code}
> As a result we should set the modifier back to public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-13 Thread Tao Jie (JIRA)
Tao Jie created YARN-6000:
-

 Summary: Set modifier of interface Listener in 
AllocationFileLoaderService to public
 Key: YARN-6000
 URL: https://issues.apache.org/jira/browse/YARN-6000
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tao Jie


We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
{{FairSchedulerShim}}. 
{code}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
{code}
As a result we should set the modifier back to public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6000) Set modifier of interface Listener in AllocationFileLoaderService to public

2016-12-13 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-6000:
--
Affects Version/s: 3.0.0-alpha1

> Set modifier of interface Listener in AllocationFileLoaderService to public
> ---
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in 
> YARN-4997 since it triggered a findbugs warning. However, it breaks Hive code in 
> {{FairSchedulerShim}}. 
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> {code}
> As a result we should set the modifier back to public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746936#comment-15746936
 ] 

Tao Jie commented on YARN-4997:
---

[~sershe], we discussed the modifier of {{interface Listener}} earlier in this 
patch. We removed {{public}} from interface Listener since it triggered a findbugs 
warning, and found that {{public}} there was not necessary.
Since this breaks Hive code, I prefer to add {{public}} back to Listener.

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-29 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707795#comment-15707795
 ] 

Tao Jie commented on YARN-4997:
---

[~templedf], thanks for your patient review, and sorry for my inaccurate 
understanding of your comments.
Updated the patch.

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-29 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4997:
--
Attachment: YARN-4997-011.patch

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-28 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704462#comment-15704462
 ] 

Tao Jie commented on YARN-4997:
---

Updated the patch with respect to [~templedf]'s comments. The test failure is 
unrelated.

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-28 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4997:
--
Attachment: YARN-4997-010.patch

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5040) CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when yarn.nodemanager.resource.percentage-physical-cpu-limit < 100

2016-11-22 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688554#comment-15688554
 ] 

Tao Jie commented on YARN-5040:
---

We have met the same problem. We set 
yarn.nodemanager.resource.percentage-physical-cpu-limit=80 and tested both kernel 
versions 2.6.32-642 and 3.10.103 with hadoop-2.7.1 by running Terasort, and the 
kernel crashed in both cases.
Then we updated the kernel to version 4.8.1, and the kernel panic no longer 
occurred. It seems that this kernel panic is due to a kernel cgroup bug, which is 
fixed in later kernel versions. 

> CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when 
> yarn.nodemanager.resource.percentage-physical-cpu-limit < 100
> --
>
> Key: YARN-5040
> URL: https://issues.apache.org/jira/browse/YARN-5040
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Sidharta Seethana
>Assignee: Varun Vasudev
>
> /cc [~vvasudev]
> We have been running some benchmarks internally with resource isolation 
> enabled. We have consistently run into kernel panics when running a large job 
> (a large pi job, terasort). These kernel panics went away when we set 
> yarn.nodemanager.resource.percentage-physical-cpu-limit=100. Anything less 
> than 100 triggers different behavior in YARN's CPU resource handler which 
> seems to cause these issues. Looking at the kernel crash dumps, the 
> backtraces were different - sometimes pointing to java processes, sometimes 
> not. 
> Kernel versions used : 3.10.0-229.14.1.el7.x86_64 and 
> 3.10.0-327.13.1.el7.x86_64 . 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2016-11-03 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634894#comment-15634894
 ] 

Tao Jie commented on YARN-2497:
---

Are there any updates on this JIRA? It would be very useful to me.

> Changes for fair scheduler to support allocate resource respect labels
> --
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-03 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15632064#comment-15632064
 ] 

Tao Jie commented on YARN-5720:
---

It seems that the nodeLabel-related commands are not included in YarnCommands.html 
in branch-2.8.

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720-branch-2.8.patch, YARN-5720.001.patch, 
> YARN-5720.002.patch, YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands have 
> changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-03 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YARN-5720-branch-2.8.patch

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720-branch-2.8.patch, YARN-5720.001.patch, 
> YARN-5720.002.patch, YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands have 
> changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-11-02 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.009.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch, 
> YARN-5552.008.patch, YARN-5552.009.patch
>
>
> Currently yarn API records such as ResourceRequest, AllocateRequest/Response, as 
> well as AMRMClient.ContainerRequest, have multiple constructors / newInstance 
> methods. This makes it very difficult to add new fields to these records.
> It would probably be better if we had Builder classes for many of these records, 
> which would make the evolution of these records a bit easier.
> (suggested by [~kasha])
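
A minimal sketch of the builder idea; the record and its fields below are 
hypothetical, not the actual YARN classes:
{code}
public class ExampleRecord {
  private final int priority;
  private final String nodeLabel;

  private ExampleRecord(Builder b) {
    this.priority = b.priority;
    this.nodeLabel = b.nodeLabel;
  }

  public static class Builder {
    // Defaults mean a new field can be added later without touching any
    // existing constructor or newInstance signature.
    private int priority = 0;
    private String nodeLabel = "";

    public Builder priority(int p) { this.priority = p; return this; }
    public Builder nodeLabel(String l) { this.nodeLabel = l; return this; }
    public ExampleRecord build() { return new ExampleRecord(this); }
  }
}
{code}
Callers would then write, e.g., {{new ExampleRecord.Builder().priority(1).build()}}, 
and existing call sites would not break when a field is added.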



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-02 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15628123#comment-15628123
 ] 

Tao Jie commented on YARN-5697:
---

Hi [~Naganarasimha],
I checked the test log, and it seems that the test case failures are due to the 
test environment:
{quote}
testNonExistentUser(org.apache.hadoop.yarn.client.TestGetGroups)  Time elapsed: 
0.004 sec  <<< ERROR!
java.net.UnknownHostException: Invalid host name: local host is: (unknown); 
destination host is: "7ed7e992eec3":8033; java.net.UnknownHostException; For 
more details see:  http://wiki.apache.org/hadoop/UnknownHost
{quote}
I also ran the failed test cases in my local environment, and all of them passed.

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005-branch-2.8.patch, 
> YARN-5697.005.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-11-02 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.008.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch, 
> YARN-5552.008.patch
>
>
> Currently yarn API records such as ResourceRequest, AllocateRequest/Response, as 
> well as AMRMClient.ContainerRequest, have multiple constructors / newInstance 
> methods. This makes it very difficult to add new fields to these records.
> It would probably be better if we had Builder classes for many of these records, 
> which would make the evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5697:
--
Attachment: YARN-5697.005-branch-2.8.patch

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005-branch-2.8.patch, 
> YARN-5697.005.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.007.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch, YARN-5552.006.patch, YARN-5552.007.patch
>
>
> Currently yarn API records such as ResourceRequest, AllocateRequest/Response, as 
> well as AMRMClient.ContainerRequest, have multiple constructors / newInstance 
> methods. This makes it very difficult to add new fields to these records.
> It would probably be better if we had Builder classes for many of these records, 
> which would make the evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4997:
--
Attachment: (was: YARN-4997-009.patch)

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4997:
--
Attachment: YARN-4997-009.patch

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-009.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.006.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch, YARN-5552.006.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-01 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624840#comment-15624840
 ] 

Tao Jie commented on YARN-4997:
---

Hi [~kasha], I rebased the patch for review.
Regarding the semantics of {{setPermission}}, I checked the code in RANGER, 
where {{RangerYarnAuthorizer}} extends {{YarnAuthorizationProvider}} and 
overrides the {{setPermission}} method. As a result, we should keep this method 
as it is.
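For context, the extension point looks roughly like the self-contained sketch 
below. The class names mirror {{YarnAuthorizationProvider}} and 
{{RangerYarnAuthorizer}}, but the method signatures here are simplified 
stand-ins, which is exactly the point: external providers such as RANGER 
compile against the real signatures, so changing them would break those plugins.
{code}
// Simplified stand-in types; the real signatures are defined by
// YarnAuthorizationProvider in the Hadoop release being targeted.
abstract class AuthorizationProviderSketch {
  // External plugins override this; changing its signature breaks them.
  abstract void setPermission(String target, java.util.List<String> acls);
  abstract boolean checkPermission(String accessType, String target, String user);
}

// A plugin in the spirit of RangerYarnAuthorizer: it relies on the
// overridden method keeping a stable signature across releases.
class RangerLikeAuthorizer extends AuthorizationProviderSketch {
  @Override
  void setPermission(String target, java.util.List<String> acls) {
    // Delegate ACL updates to an external policy store.
  }
  @Override
  boolean checkPermission(String accessType, String target, String user) {
    // The decision would come from the external policy engine.
    return true;
  }
}
{code}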

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4997:
--
Attachment: YARN-4997-009.patch

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5697:
--
Attachment: YARN-5697.005.patch

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch, YARN-5697.005.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624599#comment-15624599
 ] 

Tao Jie commented on YARN-5720:
---

Updated the document with respect to the discussion in YARN-5697, and changed 
the position of the option {{-failOnUnknownNodes}}.
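For instance (node and label names are placeholders; the ordering mirrors the 
format settled on in YARN-5697):
{code}
# before: the option sat between the subcommand and the mapping
yarn rmadmin -replaceLabelsOnNode -failOnUnknownNodes "node1=label1"
# after: the option follows the node-to-labels mapping
yarn rmadmin -replaceLabelsOnNode "node1=label1" -failOnUnknownNodes
{code}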

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch, 
> YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YarnCommands.png

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch, 
> YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: nodeLabel.png

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: (was: YarnCommands.png)

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YARN-5720.002.patch

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: (was: nodeLabel.png)

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YARN-5720.002.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5697:
--
Attachment: YARN-5697.004.patch

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5697:
--
Attachment: (was: YARN-5697.004.patch)

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624544#comment-15624544
 ] 

Tao Jie commented on YARN-5697:
---

Updated this patch as discussed above.
Note that removing {{-directlyAccessNodeLabelStore}} is not included in this patch.
[~Naganarasimha], would you mind reviewing this patch?

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-11-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5697:
--
Attachment: YARN-5697.004.patch

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch, YARN-5697.004.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5552) Add Builder methods for common yarn API records

2016-10-14 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575292#comment-15575292
 ] 

Tao Jie commented on YARN-5552:
---

Updated the patch according to [~leftnoteasy]'s suggestion.
[~asuresh], [~kasha], [~leftnoteasy], would you mind reviewing the latest patch 
again?

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-10-14 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.005.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch, 
> YARN-5552.005.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-10-14 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.004.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch, YARN-5552.001.patch, 
> YARN-5552.002.patch, YARN-5552.003.patch, YARN-5552.004.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570652#comment-15570652
 ] 

Tao Jie commented on YARN-5720:
---

Attached pictures of the generated HTML, which are easier to review.

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YarnCommands.png
nodeLabel.png

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch, YarnCommands.png, nodeLabel.png
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YarnCommands.html
NodeLabel.html
YARN-5720.001.patch

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: (was: YARN-5720.001.patch)

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: (was: YarnCommands.html)

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-12 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: (was: NodeLabel.html)

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, the document should be updated since the commands 
> have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-10-12 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570580#comment-15570580
 ] 

Tao Jie commented on YARN-5697:
---

Thank you, [~Naganarasimha].
I tried the more ideal logic in an earlier patch, but it failed in the test 
case {{TestRMAdminCLI#directlyAccessNodeLabelStore}}:
{code}
// change the sequence of "-directlyAccessNodeLabelStore" and labels,
// should not matter
args = new String[] { "-addToClusterNodeLabels",
    "-directlyAccessNodeLabelStore", "x,y" };
assertEquals(0, rmAdminCLI.run(args));
assertTrue(dummyNodeLabelsManager.getClusterNodeLabelNames().containsAll(
    ImmutableSet.of("x", "y")));
{code}
It seems that we currently don't care about the position of 
{{-directlyAccessNodeLabelStore}} on the command line.
Although {{-directlyAccessNodeLabelStore}} is marked as deprecated, the option 
still leads to a different code path:
{code}
if (directlyAccessNodeLabelStore) {
  getNodeLabelManagerInstance(getConf()).replaceLabelsOnNode(map);
} else {
  ResourceManagerAdministrationProtocol adminProtocol =
      createAdminProtocol();
  ReplaceLabelsOnNodeRequest request =
      ReplaceLabelsOnNodeRequest.newInstance(map);
  request.setFailOnUnknownNodes(failOnUnknownNodes);
  adminProtocol.replaceLabelsOnNode(request);
}
{code}
Should we just remove the logic for {{-directlyAccessNodeLabelStore}} in this 
patch? To make it clear:
1. We should restrict the command-line format ({{rmadmin -addToClusterNodeLabels 
-directlyAccessNodeLabelStore x,y}} will no longer be accepted; likewise, 
{{rmadmin -replaceLabelsOnNode -failOnUnknownNodes node1=label1}} should become 
{{rmadmin -replaceLabelsOnNode node1=label1 -failOnUnknownNodes}}).
2. We should remove the code for {{-directlyAccessNodeLabelStore}} in this patch.
3. We should update the document and remove {{-directlyAccessNodeLabelStore}}.
Agree? (A sketch of the restricted parsing in point 1 follows below.)
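A minimal sketch of what the restriction in point 1 means in commons-cli terms; 
the option registration below is hypothetical and only illustrates that, with 
{{getOptionValues}}-style parsing, a value must directly follow its option:
{code}
import org.apache.commons.cli.*;

public class StrictFormatSketch {
  public static void main(String[] args) throws ParseException {
    Options opts = new Options();
    opts.addOption("replaceLabelsOnNode", true, "node-to-labels mapping");
    opts.addOption("failOnUnknownNodes", false, "fail on unknown nodes");

    // Accepted: the mapping immediately follows -replaceLabelsOnNode.
    CommandLine ok = new GnuParser().parse(opts, new String[] {
        "-replaceLabelsOnNode", "node1=label1", "-failOnUnknownNodes"});
    System.out.println(ok.getOptionValue("replaceLabelsOnNode")); // node1=label1

    // Rejected: another option sits where the value is expected.
    try {
      new GnuParser().parse(opts, new String[] {
          "-replaceLabelsOnNode", "-failOnUnknownNodes", "node1=label1"});
    } catch (MissingArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
{code}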
 

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Fix For: 2.8.0
>
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5697) Use CliParser to parse options in RMAdminCLI

2016-10-11 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565416#comment-15565416
 ] 

Tao Jie commented on YARN-5697:
---

Thank you, [~Naganarasimha], for your comments.
As I mentioned before, I tried {{cliParser.getOptionValues}} in the earlier 
patch but found an incompatibility. Both {{rmadmin -replaceLabelsOnNode 
node1=label1 -directlyAccessNodeLabelStore}} and {{rmadmin -replaceLabelsOnNode 
-directlyAccessNodeLabelStore node1=label1}} work with the existing logic. When 
I use {{cliParser.getOptionValues}} to parse the latter command, 
{{node1=label1}} is parsed as the option value of 
{{-directlyAccessNodeLabelStore}} rather than of {{-replaceLabelsOnNode}}. 
Actually, {{-directlyAccessNodeLabelStore}} is valid in any position. As a 
result I use {{cliParser.getArgs()}}, which ignores the argument order but 
stays compatible with the existing logic.
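A small sketch of that workaround, with a hypothetical option registration: 
{{getArgs()}} returns whatever tokens were not consumed as options, so the 
node-to-labels mapping is recovered no matter where the deprecated flag appears.
{code}
import java.util.Arrays;
import org.apache.commons.cli.*;

public class ArgsOrderSketch {
  public static void main(String[] args) throws ParseException {
    Options opts = new Options();
    // Both switches registered as no-arg options, for illustration only.
    opts.addOption("replaceLabelsOnNode", false, "replace labels on nodes");
    opts.addOption("directlyAccessNodeLabelStore", false, "deprecated flag");

    String[][] commands = {
        {"-replaceLabelsOnNode", "node1=label1", "-directlyAccessNodeLabelStore"},
        {"-replaceLabelsOnNode", "-directlyAccessNodeLabelStore", "node1=label1"}};
    for (String[] cmd : commands) {
      CommandLine cli = new GnuParser().parse(opts, cmd);
      // Both orderings print [node1=label1]: the mapping survives reordering.
      System.out.println(Arrays.toString(cli.getArgs()));
    }
  }
}
{code}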

> Use CliParser to parse options in RMAdminCLI
> 
>
> Key: YARN-5697
> URL: https://issues.apache.org/jira/browse/YARN-5697
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Fix For: 2.8.0
>
> Attachments: YARN-5697.001.patch, YARN-5697.002.patch, 
> YARN-5697.003.patch
>
>
> As discussed in YARN-4855, it is better to use CliParser rather than args to 
> parse command line options in RMAdminCli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-11 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Attachment: YARN-5720.001.patch

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, document should be updated since commands has 
> changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-11 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5720:
--
Affects Version/s: 2.8.0

> Update document for "rmadmin -replaceLabelOnNode"
> -
>
> Key: YARN-5720
> URL: https://issues.apache.org/jira/browse/YARN-5720
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-5720.001.patch
>
>
> As mentioned in YARN-4855, document should be updated since commands has 
> changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5720) Update document for "rmadmin -replaceLabelOnNode"

2016-10-11 Thread Tao Jie (JIRA)
Tao Jie created YARN-5720:
-

 Summary: Update document for "rmadmin -replaceLabelOnNode"
 Key: YARN-5720
 URL: https://issues.apache.org/jira/browse/YARN-5720
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Tao Jie
Assignee: Tao Jie
Priority: Minor


As mentioned in YARN-4855, the document should be updated since the commands 
have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


