Clearing the `flink.size` option and setting `process.size` could indeed be
a solution. The thing I'm wondering is what would happen if the user has
configured `task.heap.size` and `managed.size` instead of `flink.size`?
Would we also ignore these settings? If not, then we risk to run into the
situation that TaskExecutorResourceUtils fails because the memory does not
add up. I guess the thing which bugs me a bit is the special casing which
could lead easily into inconsistent behaviour if we don't cover all cases.
The consequence of using slightly different concepts (`flink.size`,
`process.size`) in standalone vs. container/Yarn/Mesos mode in order to
keep the status quo is an increased maintenance overhead which we should be
aware of.

Cheers,
Till

On Tue, Jan 14, 2020 at 3:59 AM Xintong Song <tonysong...@gmail.com> wrote:

> True, even we have "process.size" rather than "flink.size" in the default
> config file, user can still have "flink.size" in their custom config file.
> If we consider "-tm" as a shortcut for users to override the TM memory
> size, then it makes less sense to require users to remove "flink.size" from
> their config file whenever then want to use "-tm".
>
> I'm convinced that ignoring "flink.size" might not be a bad idea.
> And I think we should also update the description of "-tm" (in
> "FlinkYarnSessionCli") to explicitly mention that it would overwrite
> "flink.size" and "process.size".
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Jan 14, 2020 at 2:24 AM Stephan Ewen <se...@apache.org> wrote:
>
> > Would be good to hear the thoughts of some more Yarn users, though.
> >
> > On Mon, Jan 13, 2020 at 7:23 PM Stephan Ewen <se...@apache.org> wrote:
> >
> > > I think we need an interpretation of "-tm" regardless of what is in the
> > > default configuration, because we can always have a modified
> > configuration
> > > and then a user passes the "-tm" flag.
> > >
> > > I kind of like the first proposal: Interpret "-tm" as "override memory
> > > size config and set the Yarn TM container size." It would mean exactly
> > > ignoring "taskmanager.memory.flink.size" and using the "-tm" value as "
> > > "taskmanager.memory.process.size".
> > > That does not sound too bad to me.
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Mon, Jan 13, 2020 at 5:35 PM Andrey Zagrebin <azagre...@apache.org>
> > > wrote:
> > >
> > >> Hi all,
> > >>
> > >> While working on changing process memory to Flink memory in default
> > >> configuration, Xintong encountered a problem.
> > >> When -tm option is used to rewrite container memory, basically process
> > >> memory, it can collide with the default Flink memory.
> > >> For legacy users it should not be a problem as we adjusted the legacy
> > heap
> > >> size option to be interpreted differently for standalone and container
> > >> modes.
> > >>
> > >> One solution could be to say in -tm docs that we rewrite both options
> > >> under
> > >> the hood: process and Flink memory, basically unset Flink memory from
> > yaml
> > >> config.
> > >> The downside is that this adds more magic.
> > >>
> > >> Alternatively, we can keep process memory in default config and, as
> > >> mentioned before, increase it to maintain the user experience by
> > matching
> > >> the previous default setting for heap (now Flink in standalone) size.
> > >> The Flink memory can be mentioned in process memory comment as a
> simpler
> > >> alternative which does not require accounting for JVM overhead.
> > >> The downside is again more confusion while trying out Flink and tuning
> > >> memory at the same time.
> > >> On the other hand, if memory already needs to be tuned it should
> > >> quite quickly lead to the necessity of understanding the memory model
> in
> > >> Flink.
> > >>
> > >> Best,
> > >> Andrey
> > >>
> > >> On Thu, Jan 9, 2020 at 12:27 PM Stephan Ewen <se...@apache.org>
> wrote:
> > >>
> > >> > Great! Thanks, guys, for the continued effort on this topic!
> > >> >
> > >> > On Thu, Jan 9, 2020 at 5:27 AM Xintong Song <tonysong...@gmail.com>
> > >> wrote:
> > >> >
> > >> > > Thanks all for the discussion. I believe we have get consensus on
> > all
> > >> the
> > >> > > open questions discussed in this thread.
> > >> > >
> > >> > > Since Andrey already create a jira ticket for renaming shuffle
> > memory
> > >> > > config keys with "taskmanager.memory.network.*", I'll create
> ticket
> > >> for
> > >> > the
> > >> > > other topic that puts flink.size in flink-conf.yaml.
> > >> > >
> > >> > > Thank you~
> > >> > >
> > >> > > Xintong Song
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Wed, Jan 8, 2020 at 9:39 PM Andrey Zagrebin <
> > azagre...@apache.org>
> > >> > > wrote:
> > >> > >
> > >> > > > It also looks to me that we should only swap network and memory
> in
> > >> the
> > >> > > > option names: 'taskmanager.memory.network.*'.
> > >> > > > There is no strong consensus towards using new 'shuffle' naming
> so
> > >> we
> > >> > can
> > >> > > > just rename it to  'taskmanager.memory.network.*' as 'shuffle'
> > >> naming
> > >> > has
> > >> > > > never been released.
> > >> > > > When we have other shuffle services and start advertising more
> > this
> > >> > > concept
> > >> > > > in Flink, we could revisit again the whole naming for this
> > concept.
> > >> > > > https://jira.apache.org/jira/browse/FLINK-15517
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>

Reply via email to