Hi Zakelly,

Thanks for driving the discussion.

1.
>> But I'm not so sure since there is only one savepoint-related option.
Maybe someone else could share some thoughts here.

How about
state.savepoints.dir -> execution.checkpointing.savepoint.dir
state.checkpoints.dir -> execution.checkpointing.checkpoint.dir

2. We changed the execution.checkpointing.local-copy' to
'execution.checkpointing.local-copy.enabled'. Should we also add "enabled"
suffix for other boolean type configuration options ? For example,
execution.checkpointing.incremental ->
execution.checkpointing.incremental.enabled

In this way, the naming style of configuration options is unified, and it
can avoid potential similar problems (for example, we may need to add more
options for incremental checkpoint in the future).

Best,
Lijie

Yanfei Lei <fredia...@gmail.com> 于2023年12月26日周二 12:05写道:

> Hi Zakelly,
>
> Thank you for creating the FLIP and starting the discussion.
>
> The current arrangement of these options is indeed somewhat haphazard,
> and the new arrangement looks much better. I have some questions about
> the arrangement of some new configuration options:
>
> 1. For some state backends that do not support incremental checkpoint,
> how does the execution.checkpointing.incrementaloption take effect? Or
> is it better to put incremental under state.backend.xxx.incremental?
>
> 2. I'm a little worried that putting all configurations into
> `ExecutionCheckpointingOptions` will introduce some dependency
> problems. Some options would be used by flink-runtime module, but
> flink-runtime should not depend on flink-streaming-java. e.g.
> FLINK-28286[1].
> So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
>
> [1] https://issues.apache.org/jira/browse/FLINK-28286
>
> --
> Best,
> Yanfei
>
> Zakelly Lan <zakelly....@gmail.com> 于2023年12月25日周一 21:14写道:
> >
> > Hi Rui Fan and Junrui,
> >
> > Thanks for the reminder! I agree to change the
> > 'execution.checkpointing.local-copy' to
> > 'execution.checkpointing.local-copy.enabled'.
> >
> > And for other suggestions Rui proposed:
> >
> > 1. How about execution.checkpointing.storage.type instead
> > > of execution.checkpointing.storage?
> >
> >
> > Ah, I missed something here. Actually I suggest we could merge the
> current
> > 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI
> > configuration named 'execution.checkpointing.dir'. WDYT?
> >
> > 3. execution.checkpointing.savepoint.dir is a little weird.
> > >
> >
> > Yes, I think it is better to make 'savepoint' and 'checkpoint' the same
> > level. But I'm not so sure since there is only one savepoint-related
> > option. Maybe someone else could share some thoughts here.
> >
> > 4. How about execution.recovery.claim-mode instead of
> > > execution.recovery.mode?
> > >
> >
> >  Agreed. That's more accurate.
> >
> >
> > Many thanks for your suggestions!
> >
> > Best,
> > Zakelly
> >
> > On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <jrlee....@gmail.com> wrote:
> >
> > > Hi Zakelly,
> > >
> > > Thanks for driving this. I agree that the proposed restructuring of the
> > > configuration options is largely positive. It will make understanding
> and
> > > working with Flink configurations more intuitive.
> > >
> > > Most of the proposed changes look great. Just a heads-up, as Rui Fan
> > > mentioned, Flink currently requires that no configOption's key be the
> > > prefix of another to avoid issues when we eventually adopt a standard
> YAML
> > > parser, as detailed in FLINK-29372 (
> > > https://issues.apache.org/jira/browse/FLINK-29372). Therefore, it's
> better
> > > to change the key 'execution.checkpointing.local-copy' because it
> serves as
> > > a prefix to the key 'execution.checkpointing.local-copy.dir'.
> > >
> > > Best regards,
> > > Junrui
> > >
> > > Rui Fan <1996fan...@gmail.com> 于2023年12月25日周一 19:11写道:
> > >
> > > > Hi Zakelly,
> > > >
> > > > Thank you for driving this proposal!
> > > >
> > > > Overall good for me. I have some questions about these names.
> > > >
> > > > 1. How about execution.checkpointing.storage.type instead of
> > > > execution.checkpointing.storage?
> > > >
> > > > It's similar to state.backend.type.
> > > >
> > > > 2. How about execution.checkpointing.local-copy.enabled instead of
> > > > execution.checkpointing.local-copy?
> > > >
> > > > You added a new option: execution.checkpointing.local-copy.dir.
> > > > IIUC, one option name shouldn't be the prefix of other options.
> > > > If you add a new option execution.checkpointing.local-copy,
> > > > flink CI will fail directly.
> > > >
> > > > 3. execution.checkpointing.savepoint.dir is a little weird.
> > > >
> > > > For old options: state.savepoints.dir and state.checkpoints.dir,
> > > > the savepoint and checkpoint are the same level. It means
> > > > it's a checkpoint or savepoint.
> > > >
> > > > The new option execution.checkpointing.dir is fine for me.
> > > > However, execution.checkpointing.savepoint.dir is a little weird.
> > > > I don't know which name is better now. Let us think about it more.
> > > >
> > > > 4. How about execution.recovery.claim-mode instead of
> > > > execution.recovery.mode?
> > > >
> > > > The meaning of mode is too broad. The claim-mode may
> > > > be more accurate for users.
> > > >
> > > > WDYT?
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <zakelly....@gmail.com>
> > > wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > I'd like to start a discussion on FLIP-406: Reorganize State &
> > > > > Checkpointing & Recovery Configuration[1].
> > > > >
> > > > > Currently, the configuration options pertaining to checkpointing,
> > > > recovery,
> > > > > and state management are primarily grouped under the following
> > > prefixes:
> > > > >
> > > > >    - state.backend.* : configurations related to state accessing
> and
> > > > >    checkpointing, as well as specific options for individual state
> > > > backends
> > > > >    - execution.checkpointing.* : configurations associated with
> > > > checkpoint
> > > > >    execution and recovery
> > > > >    - execution.savepoint.*: configurations for recovery from
> savepoint
> > > > >
> > > > > In addition, there are several individual options such as '
> > > > > *state.checkpoint-storage*' and '*state.checkpoints.dir*' that fall
> > > > outside
> > > > > of these prefixes. The current arrangement of these options, which
> span
> > > > > multiple modules, is somewhat haphazard and lacks a systematic
> > > structure.
> > > > > For example, the options under the '*CheckpointingOptions*' and '
> > > > > *ExecutionCheckpointingOptions*' are related and have no clear
> > > boundaries
> > > > > from the user's perspective, but there is no unified prefix for
> them.
> > > > With
> > > > > the upcoming release of Flink 2.0, we have an excellent
> opportunity to
> > > > > overhaul and restructure the configurations related to
> checkpointing,
> > > > > recovery, and state management. This FLIP proposes to reorganize
> these
> > > > > settings, making it more coherent by module, which would
> significantly
> > > > > lower the barriers for understanding and reduce the development
> costs
> > > > > moving forward.
> > > > >
> > > > > Looking forward to hearing from you!
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560
> > > > >
> > > > > Best,
> > > > > Zakelly
> > > > >
> > > >
> > >
>

Reply via email to