[ 
https://issues.apache.org/jira/browse/FLINK-15948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035181#comment-17035181
 ] 

Xintong Song edited comment on FLINK-15948 at 2/13/20 2:07 AM:
---------------------------------------------------------------

1. I just checked on this. -Turns out we already have such a check and warning 
logs on the client side.- We do check whether the process memory is smaller 
than the Yarn minimum allocation, but not whether it is an exact multiple of 
it. See {{YarnClusterDescriptor#validateClusterResources}}.

2. AFAIK, 
[FLIP-75|https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing]
 proposes adding a metric for the task executor's total process memory size, 
which should equal the container size on Kubernetes. Adding another metric only 
for Yarn, where the container size can be larger than the process memory size, 
does not seem necessary, especially when we already have the warning log. I am 
not sure, though, whether the container size on Mesos is guaranteed to be 
exactly as much as requested.
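
To make point 1 concrete, here is a minimal sketch of the two checks: the 
"smaller than the minimum allocation" check that exists today, and the "exact 
multiple" check that does not. This is not the code in 
{{YarnClusterDescriptor#validateClusterResources}}; the class name, method name 
and warning texts are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from Flink's YarnClusterDescriptor.
final class ProcessMemoryCheck {

    /**
     * Returns warnings for a configured process memory size (in MB) against
     * the Yarn minimum allocation (in MB). An empty list means no warning.
     */
    static List<String> check(int processMemoryMb, int yarnMinAllocationMb) {
        if (processMemoryMb <= 0 || yarnMinAllocationMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        List<String> warnings = new ArrayList<>();
        if (processMemoryMb < yarnMinAllocationMb) {
            // The kind of check the comment says already exists.
            warnings.add("process memory " + processMemoryMb
                    + "m is smaller than the Yarn minimum allocation "
                    + yarnMinAllocationMb + "m");
        } else if (processMemoryMb % yarnMinAllocationMb != 0) {
            // The check the comment says is missing.
            warnings.add("process memory " + processMemoryMb
                    + "m is not a multiple of the Yarn minimum allocation "
                    + yarnMinAllocationMb + "m");
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Values from the issue description: 2000m requested, 128m minimum.
        check(2000, 128).forEach(System.out::println);
    }
}
```

With the values from the issue description (2000m and 128m), only the second 
branch fires.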


was (Author: xintongsong):
1. I just checked on this. Turns out we already have such checking and warning 
logs on client side. See {{ YarnClusterDescriptor#validateClusterResources }}.

2. AFAIK, it is proposed in 
[FLIP-75|https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing]
 to add metrics for task executor total process memory size, which should be 
the container size on Kubernetes. It seems not necessary to add another metric 
only for Yarn where the container size could be larger than the process memory 
size, especially when we already have the warning log. Not sure whether 
container size is guaranteed to be exactly as much as requested on Mesos though.

> Resource will be wasted when the task manager memory is not a multiple of 
> Yarn minimum allocation
> -------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-15948
>                 URL: https://issues.apache.org/jira/browse/FLINK-15948
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.10.0
>            Reporter: Yang Wang
>            Priority: Major
>
> If {{taskmanager.memory.process.size}} is set to 2000m and the Yarn 
> minimum allocation is 128m, we will get a container with 2048m. Currently, 
> {{TaskExecutorProcessSpec}} is built with 2000m, so 48m is wasted and 
> cannot be used by Flink.
> I think Flink already accounts for all JVM heap, off-heap, and overhead 
> memory, so we should not leave this free memory unused. I suggest updating 
> {{TaskExecutorProcessSpec}} according to the container Yarn actually allocates.
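
The arithmetic in the description can be reproduced with a short sketch. It 
assumes, as the description does, that the container size is the request 
rounded up to the next multiple of the minimum allocation; the actual 
normalization depends on the Yarn scheduler and its configuration. The class 
and method names are hypothetical and not part of Flink or Yarn.

```java
// Hypothetical helper; assumes requests are rounded up to a multiple of the
// minimum allocation, as described in the issue.
final class ContainerRounding {

    /** Rounds requestedMb up to the next multiple of minAllocationMb. */
    static long roundUpToMultiple(long requestedMb, long minAllocationMb) {
        if (requestedMb <= 0 || minAllocationMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long units = (requestedMb + minAllocationMb - 1) / minAllocationMb;
        return units * minAllocationMb;
    }

    /** Memory in the container that the configured process size leaves unused. */
    static long wastedMb(long requestedMb, long minAllocationMb) {
        return roundUpToMultiple(requestedMb, minAllocationMb) - requestedMb;
    }

    public static void main(String[] args) {
        long requestedMb = 2000;     // taskmanager.memory.process.size
        long minAllocationMb = 128;  // Yarn minimum allocation
        System.out.println("container: "
                + roundUpToMultiple(requestedMb, minAllocationMb) + "m"); // 2048m
        System.out.println("wasted: "
                + wastedMb(requestedMb, minAllocationMb) + "m");          // 48m
    }
}
```

A request that is already a multiple of the minimum allocation, for example 
2048m, wastes nothing under this assumption.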



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
