[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue

2017-06-20 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056662#comment-16056662
 ] 

Yufei Gu commented on YARN-6710:


Thanks [~daemon] for the detailed information.

Basically you are saying that the latency of handling APP_ATTEMPT_REMOVED causes two 
issues: 1) the amResourceUsage issue, which has already been fixed in YARN-3415, and 2) the RM 
shouldn't assign any container to an application whose attempt has finished but 
which still has pending resource requests. Issue 2 seems legitimate. To me, it is 
more a design issue in the AM (MapReduce, Spark) than an RM issue. I am not sure 
how the scheduler would check the status of the application attempt in that 
situation. If the scheduler already knows the app attempt has finished, it 
shouldn't assign any resources to it at all. Better to check whether that logic 
already exists before we move on.
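
A minimal sketch of the kind of guard meant here, in plain Java. AttemptView, AttemptState, and mayAssign are illustrative names only, not actual FairScheduler classes or methods:

{code:java}
// Illustrative sketch only: the kind of guard the scheduler could apply before
// handing out a container. AttemptView, AttemptState, and mayAssign are
// hypothetical names, not actual YARN classes or FairScheduler methods.
class AssignmentGuardSketch {

  enum AttemptState { RUNNING, UNREGISTERED, FINISHED }

  interface AttemptView {
    AttemptState getState();
    boolean hasPendingRequests();
  }

  /** Assign only to attempts that are still alive and actually want resources. */
  static boolean mayAssign(AttemptView attempt) {
    if (attempt.getState() != AttemptState.RUNNING) {
      // The attempt already unregistered or finished: its leftover requests
      // should be dropped, not satisfied with new containers.
      return false;
    }
    return attempt.hasPendingRequests();
  }

  public static void main(String[] args) {
    AttemptView finishedWithLeftovers = new AttemptView() {
      public AttemptState getState() { return AttemptState.UNREGISTERED; }
      public boolean hasPendingRequests() { return true; }
    };
    // Prints false: no containers for an attempt that has unregistered,
    // even though stale pending requests are still on its books.
    System.out.println(mayAssign(finishedWithLeftovers));
  }
}
{code}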

> There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair 
> scheduler not assign container to the queue
> ---
>
> Key: YARN-6710
> URL: https://issues.apache.org/jira/browse/YARN-6710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: daemon
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>
> There are over three thousand nodes in my Hadoop production cluster, and we 
> use the Fair Scheduler as our scheduler.
> Although there were plenty of free resources in the ResourceManager, 46 
> applications were pending.
> Those applications could not run even after several hours, and in the end I 
> had to stop them.
> I reproduced the scenario in my test environment and found a bug in 
> FSLeafQueue.
> In an extreme scenario it lets FSLeafQueue#amResourceUsage grow larger than 
> its real value.
> When the fair scheduler tries to assign a container to an application 
> attempt, it performs the following check:
> !screenshot-2.png!
> !screenshot-3.png!
> Because the value of FSLeafQueue#amResourceUsage is invalid, it is greater 
> than its real value.
> So when the value of amResourceUsage is greater than 
> Resources.multiply(getFairShare(), maxAMShare), the FSLeafQueue#canRunAppAM 
> function returns false, which makes the fair scheduler not assign any 
> container to the FSAppAttempt.
> In this scenario, all application attempts stay pending and never get any 
> resources.
> I found the reason why so many applications in my leaf queue are pending. I 
> will describe it as follows:
> When the fair scheduler first assigns a container to an application attempt, 
> it does something like this:
> !screenshot-4.png!
> When the fair scheduler removes the application attempt from the leaf queue, 
> it does something like this:
> !screenshot-5.png!
> When an application attempt unregisters itself and all the containers in 
> SchedulerApplicationAttempt#liveContainers are complete, an 
> APP_ATTEMPT_REMOVED event is sent to the fair scheduler, but it is handled 
> asynchronously.
> Before the application attempt is removed from the FSLeafQueue, there may 
> still be pending requests in the FSAppAttempt.
> The fair scheduler will then assign a container to the FSAppAttempt, and 
> because the size of liveContainers equals 1, the FSLeafQueue adds that 
> container's resource to FSLeafQueue#amResourceUsage, which makes the value of 
> amResourceUsage larger than its real value.
> In the end, the value of FSLeafQueue#amResourceUsage is pretty large even 
> though there are no applications left in the queue.
> When new applications arrive and the value of FSLeafQueue#amResourceUsage is 
> greater than Resources.multiply(getFairShare(), maxAMShare), the scheduler 
> will never assign a container to the queue.
> All of the applications in the queue stay pending forever.
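
For reference, a simplified, standalone sketch of the admission check described in the report, with plain megabyte values standing in for the Resource arithmetic of FSLeafQueue#canRunAppAM (the method signature and numbers are illustrative, not the real code):

{code:java}
// Simplified sketch of the maxAMShare admission check described above.
// The real logic lives in FSLeafQueue#canRunAppAM and works on Resource
// objects; plain megabytes are used here to keep the arithmetic visible.
class AmShareCheckSketch {

  static boolean canRunAppAM(long amResourceUsageMb, long fairShareMb,
                             float maxAMShare, long newAmMb) {
    long maxAmMb = (long) (fairShareMb * maxAMShare);
    // If amResourceUsageMb is inflated (incremented twice, or never fully
    // decremented), this sum stays above the limit even when the queue is
    // empty, so every new AM is rejected and its application stays pending.
    return amResourceUsageMb + newAmMb <= maxAmMb;
  }

  public static void main(String[] args) {
    // Healthy queue: no AM usage against a 100 GB fair share, maxAMShare 0.5.
    System.out.println(canRunAppAM(0, 102400, 0.5f, 2048));      // true
    // Leaked usage left behind by finished attempts: every new AM is refused.
    System.out.println(canRunAppAM(60000, 102400, 0.5f, 2048));  // false
  }
}
{code}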






[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue

2017-06-19 Thread daemon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053715#comment-16053715
 ] 

daemon commented on YARN-6710:
--

[~yufeigu] It may be clearer if I explain this in Chinese. On the YARN side, this issue is mainly caused by the following:
1. After an application attempt finishes running, the AM sends an unregisterApplicationMaster RPC request to the RM. When the RM handles this request, it does some simple processing, sends an APP_ATTEMPT_REMOVED event to the FairScheduler, and returns. The handling of APP_ATTEMPT_REMOVED is asynchronous, so inside the FairScheduler the corresponding FSAppAttempt is only removed some time later.

This leads to two fairly serious consequences:
1. During this interval, the FairScheduler still assigns containers to the FSAppAttempt. Moreover, while assigning a container, if the condition 
if (getLiveContainers().size() == 1 && !getUnmanagedAM()) is satisfied, the AM 
resource is added to amResourceUsage again, making amResourceUsage much larger than its real value. In practice this can leave the jobs in the queue pending forever, never getting any resources, which is exactly the situation I described above.
For the problem of amResourceUsage being counted much higher than the real value, the community already has a patch; see this JIRA:
https://issues.apache.org/jira/browse/YARN-3415.

2. The FairScheduler assigns containers to application attempts that have already finished. 
Although the RM will tell the NM to clean up those containers in the response to the 
NM's heartbeat, this still wastes resources, and with scheduling being as fast as it is today,
the problem becomes even more visible.

Although the community version has already fixed the amResourceUsage problem, I think it only solves part of the problem domain. 
Problem 2 above also urgently needs to be solved. I see that YARN-3415 also changed the Spark framework to clear all 
pending resource requests before unregistering the application attempt. 
But YARN, as a general-purpose resource allocation framework, needs to cover all the situations it may encounter. For a general-purpose framework, we cannot constrain how users use it.
We cannot rely on users releasing all pending requests before every unregisterApplicationMaster call.

So we need to perform the corresponding check before assigning a container; this is a problem that urgently needs to be 
solved. Yufei, please evaluate again, based on what I have described, whether this problem needs to be fixed.

Thanks,
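
A standalone sketch of the accounting race described in consequence 1 above, assuming plain MB counters in place of YARN's Resource arithmetic; the class and field names are illustrative and this is not the real FSAppAttempt/FSLeafQueue code:

{code:java}
// Mimics the "getLiveContainers().size() == 1" heuristic with plain counters
// to show how amResourceUsage ends up inflated; not the real scheduler code.
import java.util.ArrayList;
import java.util.List;

class AmUsageRaceSketch {
  long amResourceUsageMb = 0;              // queue-level AM resource usage
  long recordedAmMb = 0;                   // what removeApp() will subtract later
  final List<Long> liveContainersMb = new ArrayList<>();

  void assignContainer(long containerMb) {
    liveContainersMb.add(containerMb);
    // Heuristic from the report: "if this is the only live container, it must
    // be the AM container", so its resource is charged to the queue's AM usage.
    if (liveContainersMb.size() == 1) {
      amResourceUsageMb += containerMb;
      if (recordedAmMb == 0) {
        recordedAmMb = containerMb;        // remembered once as "the AM resource"
      }
    }
  }

  void containerCompleted(long containerMb) {
    liveContainersMb.remove(Long.valueOf(containerMb));
  }

  void removeApp() {
    amResourceUsageMb -= recordedAmMb;     // undoes only the recorded AM size
  }

  public static void main(String[] args) {
    AmUsageRaceSketch q = new AmUsageRaceSketch();
    q.assignContainer(2048);        // real AM container: usage = 2048
    q.assignContainer(4096);        // ordinary task container
    q.containerCompleted(4096);
    q.containerCompleted(2048);     // AM unregisters, all live containers done
    // APP_ATTEMPT_REMOVED has not arrived yet, but pending requests remain,
    // so the scheduler assigns one more container; liveContainers has size 1
    // again and this container is charged to AM usage a second time.
    q.assignContainer(4096);        // usage = 2048 + 4096
    q.removeApp();                  // subtracts only 2048; 4096 is leaked
    System.out.println(q.amResourceUsageMb);  // prints 4096, not 0
  }
}
{code}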





[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue

2017-06-14 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049393#comment-16049393
 ] 

Yufei Gu commented on YARN-6710:


[~daemon], IIUC, the root cause is that the APP_ATTEMPT_REMOVED event doesn't 
unregister the AM in time, so the delay in decreasing the AM resource usage 
blocks new applications. Is that right?




[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue

2017-06-13 Thread daemon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048658#comment-16048658
 ] 

daemon commented on YARN-6710:
--

[~dan...@cloudera.com] I am sorry, I am trying to express myself, but my English 
is poor, so it takes me a while to explain things.




[jira] [Commented] (YARN-6710) There is a heavy bug in FSLeafQueue#amResourceUsage which will let the fair scheduler not assign container to the queue

2017-06-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048642#comment-16048642
 ] 

Daniel Templeton commented on YARN-6710:


Can you give us more details?  Looking at the screenshot, I see 3600 completed 
apps, which doesn't tell me much.
