Hi all!

It is great that we have already tried out the new FLIP-49 setup with bigger jobs.

I am also +1 for the JVM metaspace and overhead changes.

Regarding 0.3 vs 0.4 for the managed memory fraction, +1 for having more
managed memory for the RocksDB limiting case.
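
To make sure we mean the same thing by the RocksDB limiting case: I assume
it is running with the strict memory limiting switch turned on, roughly like
the following in flink-conf.yaml (exact key name to be double-checked):

    state.backend.rocksdb.memory.managed: true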

In general, this looks to be mostly about the memory distribution between
the JVM heap and managed off-heap memory.
Compared to the previous default setup, the JVM heap dropped (especially
for standalone), mostly due to moving managed memory from heap to off-heap
and also adding framework off-heap memory.
This can be the most important consequence for beginners and those who rely
on the default configuration, especially with the legacy default
configuration in standalone mode, where heap.size falls back to flink.size,
but it seems there is not much we can do about that now.
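
If I understand the fallback correctly, it is roughly the following mapping
(to be double-checked against the final FLIP-49 implementation):

    # legacy option in standalone: interpreted as total Flink memory
    taskmanager.heap.size -> taskmanager.memory.flink.size
    # legacy option on containers: interpreted as total process memory
    taskmanager.heap.size -> taskmanager.memory.process.size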

I prepared a spreadsheet
<https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE>
to play with the numbers for the setups mentioned in the report.

One idea would be to set the process size (or, respectively, a smaller flink
size) to a bigger default number, like 2048 MB.
In this case, the derived absolute defaults for JVM heap and managed memory
are close to the previous defaults, especially for a managed fraction of 0.3.
This should align the defaults with the previous standalone try-out
experience, where the increased off-heap memory is not strictly controlled
by the environment anyway.
The consequence for container users who relied on the default configuration
is that, after updating, their containers will be requested with double the
size.
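
Concretely, the idea would be defaults along these lines in flink-conf.yaml
(illustrative values only, the exact derived numbers are in the spreadsheet):

    taskmanager.memory.process.size: 2048m
    taskmanager.memory.managed.fraction: 0.3
    # for standalone, a correspondingly smaller taskmanager.memory.flink.size
    # could be set instead of the process size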

Best,
Andrey


On Tue, Jan 14, 2020 at 11:20 AM Till Rohrmann <trohrm...@apache.org> wrote:

> +1 for the JVM metaspace and overhead changes.
>
> On Tue, Jan 14, 2020 at 11:19 AM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
>> I guess one of the most important results of this experiment is to have a
>> good tuning guide available for users who are past the initial try-out
>> phase because the default settings will be kind of a compromise. I assume
>> that this is part of the outstanding FLIP-49 documentation task.
>>
>> If we limit RocksDB's memory consumption by default, then I believe that
>> 0.4 would give the better all-round experience as it leaves a bit more
>> memory for RocksDB. However, I'm a bit sceptical whether we should optimize
>> the default settings for a configuration where the user still needs to
>> activate the strict memory limiting for RocksDB. In this case, I would
>> expect that the user could also adapt the managed memory fraction.
>>
>> Cheers,
>> Till
>>
>> On Tue, Jan 14, 2020 at 3:39 AM Xintong Song <tonysong...@gmail.com>
>> wrote:
>>
>>> Thanks for the feedback, Stephan and Kurt.
>>>
>>> @Stephan
>>>
>>> Regarding managed memory fraction,
>>> - It makes sense to keep the default value 0.4, if we assume rocksdb
>>> memory is limited by default.
>>> - AFAIK, currently RocksDB by default does not limit its memory usage,
>>> and I'm positive about changing that.
>>> - Personally, I don't like the idea that the out-of-box experience (for
>>> which we set the default fraction) relies on users manually turning
>>> another switch on.
>>>
>>> Regarding framework heap memory,
>>> - The major reason we set it by default is, as you mentioned, to have a
>>> safety net of minimal JVM heap size.
>>> - Also, considering the in-progress FLIP-56 (dynamic slot allocation),
>>> we want to reserve some heap memory that will not go into the slot
>>> profiles. That's why we decided the default value according to the heap
>>> memory usage of an empty task executor.
>>>
>>> @Kurt
>>> Regarding metaspace,
>>> - This config option ("taskmanager.memory.jvm-metaspace") only takes
>>> effect on TMs. Currently we do not set metaspace size for JM.
>>> - If we have the same metaspace problem on TMs, then yes, changing it
>>> from 128M to 64M will make it worse. However, IMO a 10T TPC-DS benchmark
>>> should not be considered an out-of-box experience, and it makes sense to
>>> tune the configurations for it. I think the smaller metaspace size would
>>> be a better choice for the first try-out, where a job should not be too
>>> complicated and the TM size could be relatively small (e.g., 1g).
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Tue, Jan 14, 2020 at 9:38 AM Kurt Young <ykt...@gmail.com> wrote:
>>>
>>>> Hi Xintong,
>>>>
>>>> IIRC, during our 10T TPC-DS benchmark, we suffered from the JM's
>>>> metaspace size and from full GCs caused by lots of class loading for
>>>> source input splits. Could you check whether changing the default
>>>> value from 128MB to 64MB will make it worse?
>>>>
>>>> Correct me if I misunderstood anything, also cc @Jingsong
>>>>
>>>> Best,
>>>> Kurt
>>>>
>>>>
>>>> On Tue, Jan 14, 2020 at 3:44 AM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> Hi all!
>>>>>
>>>>> Thanks a lot, Xintong, for this thorough analysis. Based on your
>>>>> analysis,
>>>>> here are some thoughts:
>>>>>
>>>>> +1 to change default JVM metaspace size from 128MB to 64MB
>>>>> +1 to change default JVM overhead min size from 128MB to 196MB
>>>>>
>>>>> Concerning the managed memory fraction, I am not sure I would change
>>>>> it,
>>>>> for the following reasons:
>>>>>
>>>>>   - We should assume RocksDB will be limited to managed memory by
>>>>> default.
>>>>> This will either be active by default or we would encourage everyone
>>>>> to use
>>>>> this by default, because otherwise it is super hard to reason about the
>>>>> RocksDB footprint.
>>>>>   - For standalone, a managed memory fraction of 0.3 is less than half
>>>>> of
>>>>> the managed memory from 1.9.
>>>>>   - I am not sure if the managed memory fraction is a value that all
>>>>> users
>>>>> adjust immediately when scaling up the memory during their first
>>>>> try-out
>>>>> phase. I would assume that most users initially only adjust
>>>>> "memory.flink.size" or "memory.process.size". A value of 0.3 will lead
>>>>> to
>>>>> having too large heaps and very little RocksDB / batch memory even when
>>>>> scaling up during the initial exploration.
>>>>>   - I agree, though, that 0.5 looks too aggressive, from your
>>>>> benchmarks.
>>>>> So maybe keeping it at 0.4 could work?
>>>>>
>>>>> And one question: Why do we set the Framework Heap by default? Is that
>>>>> so we reduce the managed memory further if less than the framework heap
>>>>> would be left from the JVM heap?
>>>>>
>>>>> Best,
>>>>> Stephan
>>>>>
>>>>> On Thu, Jan 9, 2020 at 10:54 AM Xintong Song <tonysong...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi all,
>>>>> >
>>>>> > As described in FLINK-15145 [1], we decided to tune the default
>>>>> > configuration values of FLIP-49 with more jobs and cases.
>>>>> >
>>>>> > After spending time analyzing and tuning the configurations, I've
>>>>> > come up with several findings. To be brief, I would suggest the
>>>>> > following changes, and for more details please take a look at my
>>>>> > tuning report [2].
>>>>> >
>>>>> >    - Change default managed memory fraction from 0.4 to 0.3.
>>>>> >    - Change default JVM metaspace size from 128MB to 64MB.
>>>>> >    - Change default JVM overhead min size from 128MB to 196MB.
>>>>> >
>>>>> > Looking forward to your feedback.
>>>>> >
>>>>> > Thank you~
>>>>> >
>>>>> > Xintong Song
>>>>> >
>>>>> >
>>>>> > [1] https://issues.apache.org/jira/browse/FLINK-15145
>>>>> >
>>>>> > [2]
>>>>> >
>>>>> https://docs.google.com/document/d/1-LravhQYUIkXb7rh0XnBB78vSvhp3ecLSAgsiabfVkk/edit?usp=sharing
>>>>> >
>>>>> >
>>>>>
>>>>
