[
https://issues.apache.org/jira/browse/FLINK-23952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404845#comment-17404845
]
Xintong Song commented on FLINK-23952:
--------------------------------------
## Why it worked fine in 1.13.1 but not in 1.13.2
By design, the CPU cores and all memory sizes are calculated before the Java
process starts, and are then set explicitly via configuration options. Note
that this can overwrite existing configuration. E.g., the user may configure a
[min, max] range for the network memory size; Flink's automatic calculation
logic then picks a specific value within that range and sets both the min and
max config options to that value, so that it stays consistent during the
entire lifecycle of the process.
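To illustrate the pinning step described above, here is a minimal sketch. It is
not Flink's actual implementation: the function name, the use of plain integers
in megabytes, and the "fraction of total memory, clamped to the range" rule are
simplifying assumptions made for the example. Only the three option keys are
real Flink configuration names.

```python
# Hypothetical helper; sizes are plain ints in MB for simplicity.
NETWORK_MIN = "taskmanager.memory.network.min"
NETWORK_MAX = "taskmanager.memory.network.max"
NETWORK_FRACTION = "taskmanager.memory.network.fraction"


def pin_network_memory(config: dict, total_flink_memory_mb: int) -> dict:
    """Return a copy of config with network min/max pinned to one value."""
    lower = config[NETWORK_MIN]
    upper = config[NETWORK_MAX]
    # Assumed derivation rule: a fraction of the total, clamped to [min, max].
    derived = int(total_flink_memory_mb * config[NETWORK_FRACTION])
    pinned = max(lower, min(upper, derived))

    resolved = dict(config)  # leave the user's original config untouched
    # Overwrite both ends of the range so every later reader sees one value.
    resolved[NETWORK_MIN] = pinned
    resolved[NETWORK_MAX] = pinned
    return resolved


user_config = {NETWORK_MIN: 64, NETWORK_MAX: 1024, NETWORK_FRACTION: 0.1}
resolved = pin_network_memory(user_config, total_flink_memory_mb=2048)
print(resolved[NETWORK_MIN], resolved[NETWORK_MAX])
```

After this step the configured range has collapsed to a single value, which is
why reading only the min option later is safe.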
Some internal logic in the task manager relies on the assumption that all
CPU/memory config options are explicitly set. E.g., Flink uses the min value
from the configuration as the network memory size, expecting max to be
configured to the same value. However, in 1.13.1 Flink did not check whether
all such options are explicitly configured. That explains why your scripts
worked fine in 1.13.1. Although no serious problems were observed, the memory
management may not have worked as designed/expected, in terms of stability and
resource efficiency.
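The kind of check discussed above could look roughly like the sketch below,
which a custom start script might run before launching the task manager. The
function name is hypothetical, and the list of required options is an
assumption for illustration; the authoritative list is whatever your Flink
version reports as missing.

```python
# Illustrative list only; treat the exact set of required options as an
# assumption and adjust it to what your Flink version actually demands.
REQUIRED_OPTIONS = [
    "taskmanager.cpu.cores",
    "taskmanager.memory.task.heap.size",
    "taskmanager.memory.task.off-heap.size",
    "taskmanager.memory.managed.size",
    "taskmanager.memory.network.min",
    "taskmanager.memory.network.max",
]


def find_config_problems(config: dict) -> list:
    """Return human-readable problems; an empty list means the config is pinned."""
    problems = [
        f"not explicitly set: {key}"
        for key in REQUIRED_OPTIONS
        if key not in config
    ]
    net_min = config.get("taskmanager.memory.network.min")
    net_max = config.get("taskmanager.memory.network.max")
    # The task manager reads only min, so min and max must agree.
    if net_min is not None and net_max is not None and net_min != net_max:
        problems.append(f"network min/max differ: {net_min} vs {net_max}")
    return problems


for problem in find_config_problems({"taskmanager.memory.network.min": "64m",
                                     "taskmanager.memory.network.max": "1g"}):
    print(problem)
```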
## Running Flink with custom scripts
If the built-in scripts do not meet your needs, calling BashJavaUtils from
your custom scripts should work. The key point is to calculate and configure
the resources in advance, consistently with what the other Flink components
expect. However, as [~chesnay] mentioned, there is no guarantee that these
things will stay compatible in future releases. They can change at any time
without notice, which means you may run into this kind of problem again in
the future.
Alternatively, you may consider filing Jira tickets for the issues you have
with the built-in scripts. That would be an appreciated contribution to the
community.
BTW, I think taskmanager.sh does not need to read FLINK_PLUGINS_DIR, because it
is exported as an environment variable and is read directly by the Java process.
> Taskmanager fails to start complaining about missing configuration option
> -------------------------------------------------------------------------
>
> Key: FLINK-23952
> URL: https://issues.apache.org/jira/browse/FLINK-23952
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Configuration
> Affects Versions: 1.13.2
> Reporter: Leonid Ilyevsky
> Priority: Major
> Attachments: flink-conf.yaml, taskmanager.log, taskmanager_start.txt
>
>
> Taskmanager now fails to start, after I upgraded to 1.13.2. It worked fine in
> 1.13.1.
> It suddenly started complaining about missing configuration options that are
> not really required, according to documentation. When I tried to set the one
> it complained about, it started complaining about another one.
>
> Please see attached files:
> taskmanager_start.txt - actual command that is used to start the program
> flink-conf.yaml - configuration file
> taskmanager.log - logfile where you can see the exception
>