True, even we have "process.size" rather than "flink.size" in the default config file, user can still have "flink.size" in their custom config file. If we consider "-tm" as a shortcut for users to override the TM memory size, then it makes less sense to require users to remove "flink.size" from their config file whenever then want to use "-tm".
I'm convinced that ignoring "flink.size" might not be a bad idea. And I think we should also update the description of "-tm" (in "FlinkYarnSessionCli") to explicitly mention that it would overwrite "flink.size" and "process.size". Thank you~ Xintong Song On Tue, Jan 14, 2020 at 2:24 AM Stephan Ewen <se...@apache.org> wrote: > Would be good to hear the thoughts of some more Yarn users, though. > > On Mon, Jan 13, 2020 at 7:23 PM Stephan Ewen <se...@apache.org> wrote: > > > I think we need an interpretation of "-tm" regardless of what is in the > > default configuration, because we can always have a modified > configuration > > and then a user passes the "-tm" flag. > > > > I kind of like the first proposal: Interpret "-tm" as "override memory > > size config and set the Yarn TM container size." It would mean exactly > > ignoring "taskmanager.memory.flink.size" and using the "-tm" value as " > > "taskmanager.memory.process.size". > > That does not sound too bad to me. > > > > Best, > > Stephan > > > > > > On Mon, Jan 13, 2020 at 5:35 PM Andrey Zagrebin <azagre...@apache.org> > > wrote: > > > >> Hi all, > >> > >> While working on changing process memory to Flink memory in default > >> configuration, Xintong encountered a problem. > >> When -tm option is used to rewrite container memory, basically process > >> memory, it can collide with the default Flink memory. > >> For legacy users it should not be a problem as we adjusted the legacy > heap > >> size option to be interpreted differently for standalone and container > >> modes. > >> > >> One solution could be to say in -tm docs that we rewrite both options > >> under > >> the hood: process and Flink memory, basically unset Flink memory from > yaml > >> config. > >> The downside is that this adds more magic. > >> > >> Alternatively, we can keep process memory in default config and, as > >> mentioned before, increase it to maintain the user experience by > matching > >> the previous default setting for heap (now Flink in standalone) size. > >> The Flink memory can be mentioned in process memory comment as a simpler > >> alternative which does not require accounting for JVM overhead. > >> The downside is again more confusion while trying out Flink and tuning > >> memory at the same time. > >> On the other hand, if memory already needs to be tuned it should > >> quite quickly lead to the necessity of understanding the memory model in > >> Flink. > >> > >> Best, > >> Andrey > >> > >> On Thu, Jan 9, 2020 at 12:27 PM Stephan Ewen <se...@apache.org> wrote: > >> > >> > Great! Thanks, guys, for the continued effort on this topic! > >> > > >> > On Thu, Jan 9, 2020 at 5:27 AM Xintong Song <tonysong...@gmail.com> > >> wrote: > >> > > >> > > Thanks all for the discussion. I believe we have get consensus on > all > >> the > >> > > open questions discussed in this thread. > >> > > > >> > > Since Andrey already create a jira ticket for renaming shuffle > memory > >> > > config keys with "taskmanager.memory.network.*", I'll create ticket > >> for > >> > the > >> > > other topic that puts flink.size in flink-conf.yaml. > >> > > > >> > > Thank you~ > >> > > > >> > > Xintong Song > >> > > > >> > > > >> > > > >> > > On Wed, Jan 8, 2020 at 9:39 PM Andrey Zagrebin < > azagre...@apache.org> > >> > > wrote: > >> > > > >> > > > It also looks to me that we should only swap network and memory in > >> the > >> > > > option names: 'taskmanager.memory.network.*'. > >> > > > There is no strong consensus towards using new 'shuffle' naming so > >> we > >> > can > >> > > > just rename it to 'taskmanager.memory.network.*' as 'shuffle' > >> naming > >> > has > >> > > > never been released. > >> > > > When we have other shuffle services and start advertising more > this > >> > > concept > >> > > > in Flink, we could revisit again the whole naming for this > concept. > >> > > > https://jira.apache.org/jira/browse/FLINK-15517 > >> > > > > >> > > > >> > > >> > > >