xintongsong commented on issue #9760: [FLINK-13982][runtime] Implement memory 
calculation logics
URL: https://github.com/apache/flink/pull/9760#issuecomment-536277862
 
 
   Thanks for the review, @tillrohrmann.
   It seems that the two main concerns are explicitly configured values and 
backwards compatibility.
   
   Regarding explicitly configured values, here is what I think:
   - We should always respect explicitly configured values
   - We should also always respect the default absolute values of the following 
config options, if not explicitly configured
     - Framework Heap Memory
     - Task Off-Heap Memory
     - Min/Max Shuffle Memory
     - JVM Metaspace
     - Min/Max JVM Overhead
   - If Task Heap Memory and Managed Memory are not both explicitly configured, 
and Total Flink Memory is either explicitly configured or can be derived from 
an explicitly configured Total Process Memory:
     - If neither Task Heap Memory nor Managed Memory is configured, or only 
Managed Memory is configured but not Task Heap Memory, we follow the current 
calculation logic.
     - If only Task Heap Memory is configured, but not Managed Memory
       - If Managed Memory fraction is explicitly configured, we first derive 
Managed Memory size from the fraction, then leave the remaining (Total Flink 
Memory minus Framework Heap Memory, Task Heap/Off-Heap Memory and Managed 
Memory) to Shuffle Memory. If the derived Shuffle Memory size is not within the 
(explicitly configured or default) Min/Max Shuffle Memory range, we fail.
       - If Managed Memory fraction is not configured, we first derive Shuffle 
Memory (from min/max/fraction), then leave the remaining to Managed Memory.
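
   A minimal sketch of the "only Task Heap Memory is configured" branch 
described above, for illustration only. The class, method, and parameter names 
are hypothetical and not taken from the PR. Sizes are in MB. It assumes the 
shuffle fraction is relative to Total Flink Memory and that a negative 
remainder for Managed Memory is treated as an error; neither assumption is 
stated in this comment.

```java
import java.util.OptionalDouble;

/** Hypothetical sketch; none of these names are Flink APIs. */
class TaskHeapOnlySketch {

    /** Derived sizes in megabytes. */
    static final class Split {
        final long managedMb;
        final long shuffleMb;

        Split(long managedMb, long shuffleMb) {
            this.managedMb = managedMb;
            this.shuffleMb = shuffleMb;
        }
    }

    static Split derive(
            long totalFlinkMb,
            long frameworkHeapMb,
            long taskHeapMb,
            long taskOffHeapMb,
            OptionalDouble managedFraction, // empty = not explicitly configured
            double shuffleFraction,         // assumed relative to Total Flink Memory
            long shuffleMinMb,
            long shuffleMaxMb) {
        long fixedMb = frameworkHeapMb + taskHeapMb + taskOffHeapMb;

        if (managedFraction.isPresent()) {
            // Managed Memory first, the remainder goes to Shuffle Memory.
            long managedMb = (long) (totalFlinkMb * managedFraction.getAsDouble());
            long shuffleMb = totalFlinkMb - fixedMb - managedMb;
            if (shuffleMb < shuffleMinMb || shuffleMb > shuffleMaxMb) {
                throw new IllegalArgumentException(
                        "Derived shuffle memory " + shuffleMb + "MB is outside ["
                                + shuffleMinMb + ", " + shuffleMaxMb + "]MB");
            }
            return new Split(managedMb, shuffleMb);
        }

        // Shuffle Memory first (fraction capped by min/max), the remainder goes to Managed Memory.
        long byFractionMb = (long) (totalFlinkMb * shuffleFraction);
        long shuffleMb = Math.min(shuffleMaxMb, Math.max(shuffleMinMb, byFractionMb));
        long managedMb = totalFlinkMb - fixedMb - shuffleMb;
        if (managedMb < 0) {
            // Assumption: the comment does not say what happens here.
            throw new IllegalArgumentException("No memory left for managed memory");
        }
        return new Split(managedMb, shuffleMb);
    }

    public static void main(String[] args) {
        Split s = derive(1024, 128, 256, 0, OptionalDouble.of(0.5), 0.125, 64, 1024);
        System.out.println("managed=" + s.managedMb + "MB shuffle=" + s.shuffleMb + "MB");
    }
}
```

   The two branches differ only in which of the two pools absorbs the 
remainder: with an explicit Managed Memory fraction the remainder must fit the 
shuffle min/max range, otherwise Managed Memory takes whatever is left.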
   
   Regarding backwards compatibility, the following config options are not 
supported in the current PR:
   - `taskmanager.memory.preallocate`: Pre-allocation is no longer supported.
   - `taskmanager.network.numberOfBuffers`: I talked to @zhijiangW about this. 
I think we can still support this config option as a fallback. This option will 
only take effect when 1) it is explicitly configured and 2) none of 
`taskmanager.memory.shuffle.min/max/fraction` or 
`taskmanager.network.memory.min/max/fraction` is configured. When it takes 
effect, we treat the Shuffle Memory size as explicitly configured by the user 
and derive it as page size * number of buffers.
   - `taskmanager.memory.fraction` (the legacy managed memory fraction): I'm in 
favor of removing this config option, for the following reasons.
     - The definitions of the legacy fraction and the new fraction are 
different. Keeping both of them can be confusing. 
       - legacy fraction = managed memory / (total flink memory - shuffle 
memory)
       - new fraction = managed memory / total flink memory
     - The calculation can be complicated. Computing managed memory size from 
legacy fraction depends on the shuffle memory size, while sometimes shuffle 
memory size also depends on the managed memory size. Even if we derive managed 
memory and network memory together, from total flink memory and the sum of 
managed memory fraction and network memory fraction, shuffle memory can still 
be overridden by its min/max limit, which changes the actual fraction that 
managed memory takes.
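
   To make the two legacy options above concrete, here is a small sketch of 
the `taskmanager.network.numberOfBuffers` fallback condition and of the two 
fraction definitions. All names are hypothetical and not part of Flink, and 
the numbers in `main` are example values chosen only for illustration.

```java
/** Hypothetical helper; names are illustrative and not part of Flink. */
class LegacyOptionsSketch {

    /**
     * The legacy buffer count only takes effect when it is explicitly set and
     * none of the shuffle/network min/max/fraction options is set.
     */
    static boolean bufferCountFallbackApplies(
            boolean numberOfBuffersConfigured,
            boolean anyShuffleOrNetworkMemoryOptionConfigured) {
        return numberOfBuffersConfigured && !anyShuffleOrNetworkMemoryOptionConfigured;
    }

    /** Shuffle Memory size = page size * number of buffers (overflow-checked). */
    static long shuffleBytesFromBuffers(long pageSizeBytes, long numberOfBuffers) {
        return Math.multiplyExact(pageSizeBytes, numberOfBuffers);
    }

    /** legacy fraction = managed memory / (total flink memory - shuffle memory) */
    static double legacyFraction(long managed, long totalFlink, long shuffle) {
        return (double) managed / (totalFlink - shuffle);
    }

    /** new fraction = managed memory / total flink memory */
    static double newFraction(long managed, long totalFlink) {
        return (double) managed / totalFlink;
    }

    public static void main(String[] args) {
        // Example values: 32 KB pages and 2048 buffers.
        System.out.println(shuffleBytesFromBuffers(32 * 1024, 2048) + " bytes of shuffle memory");

        // Same sizes (in MB), two different fractions.
        long managed = 512, totalFlink = 1024, shuffle = 128;
        System.out.println("legacy fraction = " + legacyFraction(managed, totalFlink, shuffle));
        System.out.println("new fraction    = " + newFraction(managed, totalFlink));
    }
}
```

   With these example sizes the same 512 MB of managed memory corresponds to 
a new fraction of 0.5 but a legacy fraction of 4/7, and the legacy value 
shifts whenever the shuffle memory size changes.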
   
   What do you think? I'd like to hear your feedback on the above two issues 
before updating the PR. Thank you.
