>
> These features don't have a public option, but they work. I'm not sure
> whether these features are used by some advanced users.
> Actually, I think some of them are valuable! For example:
>
> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
>   allows users to define the start command of the yarn container.
> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
>   flink job reads all files under the directory even if it has nested
> directories.
>
> This FLIP focuses on the refactor option, I'm afraid these features are
> used
> in some production and removing these features will affect some flink jobs.
> So I prefer to keep these features, WDTY?
>

First of all, I don't think we should support any knobs that users can only
learn how to use from reading Flink's internal codes. From this
perspective, for existing string-keyed knobs that are not mentioned in any
public documentation, yes we can argue that they are functioning, but we
can also argue that they are not really exposed to users. That means
migrating them to ConfigOption is not a pure refactor, but would make
something that used to be hidden from users now exposed to users. For such
options, I personally would lean toward not exposing them. If we consider
them as already exposed, then process-wise there's no problem in
deprecating some infrequently-used options and removing them in a major
version bump, and if they are proved needed later we can add them back
anytime. On the other hand, if we consider them as not yet exposed, then
removing them later would be a breaking change.


Secondly, I don't really come up with any cases where users need to tune
these knobs. E.g., why would we allow users to customize the yarn container
start command while we already provide `env.java.opts`? And what would be
the problem if Flink just doesn't support nested files? And even worse,
such knobs may provide chances for users to shoot themself in the foot.
E.g., what if %jvmmem% is missing from a user-provided container start
command? Admittedly, there might be a small fraction of advanced users that
know how to use these knobs. However, those users usually have their own
custom fork of Flink, and it should not be a big problem for them to build
such abilities by themselves.


Taking a step back, if we decide that some of the mentioned knobs are
really useful, we should at least provide enough descriptions to help users
understand when and how to use these options. E.g., the current description
for yarn container command template is far from enough, which does not
explain what the placeholders mean, what happens if some placeholders are
missing or if an unknown placeholder is provided.


Best,

Xintong



On Tue, Dec 26, 2023 at 7:39 PM Rui Fan <1996fan...@gmail.com> wrote:

> In addition to what is written in FLIP, I found some methods of
> Configuration
> are not necessary. And I wanna hear more thoughts from all of you.
>
> Configuration has a `public <T> T get(ConfigOption<T> option)` method.
> Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods?
> Such as:
> - public int getInteger(ConfigOption<Integer> configOption)
> - public String getString(ConfigOption<String> configOption)
> - public long getLong(ConfigOption<Long> configOption)
> - etc
>
> I prefer users and flink use `get(ConfigOption<T> option)` instead of
> `getXxx(ConfigOption<Xxx> configOption)` based on some reasons:
>
> 1. All callers can replace it directly without any extra effort.
>   `T get(ConfigOption<T> option)` method can replace all
>   `Xxx getXxx(ConfigOption<Xxx> configOption)` methods directly.
> 2. Callers can call get directly, and users or flink developers
>   don't need to care about should they call getInteger or getString.
> 3. Flink code is easier to maintain.
> 4. `T get(ConfigOption<T> option)` is designed later than
>     `Xxx getXxx(ConfigOption<Xxx> configOption)`, I guess if
>     `T get(ConfigOption<T> option)` is designed first,
>    all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods
>   aren't needed.
>
> Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)` methods
> can be replaced with `public <T> Configuration set(ConfigOption<T> option,
> T value)` as well.
>
> If they can be marked as deprecated, only `public Xxx
> getXxx(ConfigOption<Xxx> configOption, Xxx overrideDefault)`
> is keeped after this FLIP. And the rest of getXxx or setXxx can be removed
> in 2.0.
>
> Looking forward to everyone's feedback and suggestions, thank you!
>
> Best,
> Rui
>
> On Tue, Dec 26, 2023 at 7:15 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Thanks Xintong for the quick feedback and the good suggestions!
> >
> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` should be
> > "no
> > > default". We can explain in the description that if not configured,
> > > `System.getProperty("log.file")` will be used, but that is not a
> default
> > > value.
> >
> > Explain it in the description is fine for me.
> >
> > > 2. So I wonder if we can simply mark them as deprecated and remove in
> > 2.0.
> >
> > These features don't have a public option, but they work. I'm not sure
> > whether these features are used by some advanced users.
> > Actually, I think some of them are valuable! For example:
> >
> > - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
> >   allows users to define the start command of the yarn container.
> > - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
> >   flink job reads all files under the directory even if it has nested
> > directories.
> >
> > This FLIP focuses on the refactor option, I'm afraid these features are
> > used
> > in some production and removing these features will affect some flink
> jobs.
> > So I prefer to keep these features, WDTY?
> >
> > > 3. Simply saying "getting / setting value with string key is
> discouraged"
> > > in JavaDoc of get/setString is IMHO a bit confusing. People may have
> the
> > > question why would we keep the discouraged interfaces at all. I would
> > > suggest the following:
> > > ```
> > > We encourage users and developers to always use ConfigOption for
> getting
> > /
> > > setting the configurations if possible, for its rich description, type,
> > > default-value and other supports. The string-key-based getter / setter
> > > should only be used when ConfigOption is not applicable, e.g., the key
> is
> > > programmatically generated in runtime.
> > > ```
> >
> > Suggested comment is good for me, and I'd like to hear the thought from
> > Xuannan
> > who wrote the original comment.
> >
> > Best,
> > Rui
> >
> > On Tue, Dec 26, 2023 at 5:26 PM Xintong Song <tonysong...@gmail.com>
> > wrote:
> >
> >> Thanks for the efforts, Rui and Xuannan.
> >>
> >> I think it's a good idea to migrate string-key configuration accesses to
> >> ConfigOption-s in general.
> >>
> >> I have a few suggestions / questions regarding the FLIP.
> >>
> >> 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` should be
> "no
> >> default". We can explain in the description that if not configured,
> >> `System.getProperty("log.file")` will be used, but that is not a default
> >> value.
> >>
> >> 2. I wonder if the following string-keys can be simply removed? They are
> >> neither set by Flink, nor documented anywhere (AFAIK) so that users know
> >> how to set them. All of them were introduced a long time ago, require
> >> significant knowledge and familiarity about Flink internals and low
> level
> >> details in order to use, and some of them are even private. So I wonder
> if
> >> we can simply mark them as deprecated and remove in 2.0.
> >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
> >> - FileInputFormat.FILE_PARAMETER_KEY
> >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG
> >> - FileOutputFormat.FILE_PARAMETER_KEY
> >> - BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY
> >> - BinaryOutputFormat.BLOCK_SIZE_PARAMETER_KEY
> >>
> >> 3. Simply saying "getting / setting value with string key is
> discouraged"
> >> in JavaDoc of get/setString is IMHO a bit confusing. People may have the
> >> question why would we keep the discouraged interfaces at all. I would
> >> suggest the following:
> >> ```
> >> We encourage users and developers to always use ConfigOption for
> getting /
> >> setting the configurations if possible, for its rich description, type,
> >> default-value and other supports. The string-key-based getter / setter
> >> should only be used when ConfigOption is not applicable, e.g., the key
> is
> >> programmatically generated in runtime.
> >> ```
> >>
> >> WDYT?
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Tue, Dec 26, 2023 at 4:12 PM Rui Fan <1996fan...@gmail.com> wrote:
> >>
> >> > Hi all,
> >> >
> >> > Xuannan(cced) and I would like to start a discussion on FLIP-405:
> >> > FLIP-405: Migrate string configuration key to ConfigOption[1].
> >> >
> >> > As Flink progresses to 2.0, we want to enhance the user experience
> >> > with the existing configuration. In FLIP-77, Flink introduced
> >> ConfigOption
> >> > with DataType and strongly encourage users to utilize ConfigOption
> >> > instead of string keys for accessing and setting Flink configurations.
> >> > Presently, many string configuration keys have been deprecated and
> >> > replaced with ConfigOptions; however, some string configuration
> >> > keys are still in use.
> >> >
> >> > To ensure a better experience with the existing configuration in Flink
> >> > 2.0,
> >> > this FLIP will migrate all user-facing string configuration keys to
> >> > ConfigOptions.
> >> > Additionally, we want to modify the Configuration infrastructure to
> >> > promote the use of ConfigOption over string configuration keys
> >> > among developers and users. It's mentioned in a preview thread[2].
> >> >
> >> > Looking forward to everyone's feedback and suggestions, thank you!
> >> >
> >> > [1] https://cwiki.apache.org/confluence/x/6Yr5E
> >> > [2] https://lists.apache.org/thread/zzsf7glfcdjcjm1hfo1xdwc6jp37nb3m
> >> >
> >> > Best,
> >> > Rui
> >> >
> >>
> >
>

Reply via email to