Hi, Xia

Thanks for your reply.

> That means, in terms
of priority, `table.exec.hive.infer-source-parallelism` >
`table.exec.hive.infer-source-parallelism.mode`.

I still have some confusion, if the
`table.exec.hive.infer-source-parallelism`
>`table.exec.hive.infer-source-parallelism.mode`, currently
`table.exec.hive.infer-source-parallelism` default value is true, that
means always static parallelism inference work? Or perhaps after this FLIP,
we changed the default behavior of
`table.exec.hive.infer-source-parallelism` to indicate dynamic parallelism
inference when enabled.
I think you should list the various behaviors of these two options that
coexist in FLIP by a table, only then users can know how the dynamic and
static parallelism inference work.

Best,
Ron

Xia Sun <xingbe...@gmail.com> 于2024年4月18日周四 16:33写道:

> Hi Ron and Lijie,
> Thanks for joining the discussion and sharing your suggestions.
>
> > the InferMode class should also be introduced in the Public Interfaces
> > section!
>
>
> Thanks for the reminder, I have now added the InferMode class to the Public
> Interfaces section as well.
>
> > `table.exec.hive.infer-source-parallelism.max` is 1024, I checked through
> > the code that the default value is 1000?
>
>
> I have checked and the default value of
> `table.exec.hive.infer-source-parallelism.max` is indeed 1000. This has
> been corrected in the FLIP.
>
> > how are`table.exec.hive.infer-source-parallelism` and
> > `table.exec.hive.infer-source-parallelism.mode` compatible?
>
>
> This is indeed a critical point. The current plan is to deprecate
> `table.exec.hive.infer-source-parallelism` but still utilize it as the main
> switch for enabling automatic parallelism inference. That means, in terms
> of priority, `table.exec.hive.infer-source-parallelism` >
> `table.exec.hive.infer-source-parallelism.mode`. In future versions, if
> `table.exec.hive.infer-source-parallelism` is removed, this logic will also
> need to be revised, leaving only
> `table.exec.hive.infer-source-parallelism.mode` as the basis for deciding
> whether to enable parallelism inference. I have also added this description
> to the FLIP.
>
>
> > In FLIP-367 it is supported to be able to set the Source's parallelism
> > individually, if in the future HiveSource also supports this feature,
> > however, the default value of
> > `table.exec.hive.infer-source-parallelism.mode` is `InferMode.DYNAMIC`,
> at
> > this point will the parallelism be dynamically derived or will the
> manually
> > set parallelism take effect, and who has the higher priority?
>
>
> From my understanding, 'manually set parallelism' has the higher priority,
> just like one of the preconditions for the effectiveness of dynamic
> parallelism inference in the AdaptiveBatchScheduler is that the vertex's
> parallelism isn't set. I believe whether it's static inference or dynamic
> inference, the manually set parallelism by the user should be respected.
>
> > The `InferMode.NONE` option.
>
> Currently, 'adding InferMode.NONE' seems to be the prevailing opinion. I
> will add InferMode.NONE as one of the Enum options in InferMode class.
>
> Best,
> Xia
>
> Lijie Wang <wangdachui9...@gmail.com> 于2024年4月18日周四 13:50写道:
>
> > Thanks for driving the discussion.
> >
> > +1 for the proposal and +1 for the `InferMode.NONE` option.
> >
> > Best,
> > Lijie
> >
> > Ron liu <ron9....@gmail.com> 于2024年4月18日周四 11:36写道:
> >
> > > Hi, Xia
> > >
> > > Thanks for driving this FLIP.
> > >
> > > This proposal looks good to me overall. However, I have the following
> > minor
> > > questions:
> > >
> > > 1. FLIP introduced `table.exec.hive.infer-source-parallelism.mode` as a
> > new
> > > parameter, and the value is the enum class `InferMode`, I think the
> > > InferMode class should also be introduced in the Public Interfaces
> > section!
> > > 2. You mentioned in FLIP that the default value of
> > > `table.exec.hive.infer-source-parallelism.max` is 1024, I checked
> through
> > > the code that the default value is 1000?
> > > 3. I also agree with Muhammet's idea that there is no need to introduce
> > the
> > > option `table.exec.hive.infer-source-parallelism.enabled`, and that
> > > expanding the InferMode values will fulfill the need. There is another
> > > issue to consider here though, how are
> > > `table.exec.hive.infer-source-parallelism` and
> > > `table.exec.hive.infer-source-parallelism.mode` compatible?
> > > 4. In FLIP-367 it is supported to be able to set the Source's
> parallelism
> > > individually, if in the future HiveSource also supports this feature,
> > > however, the default value of
> > > `table.exec.hive.infer-source-parallelism.mode` is `InferMode.
> DYNAMIC`,
> > at
> > > this point will the parallelism be dynamically derived or will the
> > manually
> > > set parallelism take effect, and who has the higher priority?
> > >
> > > Best,
> > > Ron
> > >
> > > Xia Sun <xingbe...@gmail.com> 于2024年4月17日周三 12:08写道:
> > >
> > > > Hi Jeyhun, Muhammet,
> > > > Thanks for all the feedback!
> > > >
> > > > > Could you please mention the default values for the new
> > configurations
> > > > >     (e.g., table.exec.hive.infer-source-parallelism.mode,
> > > > >     table.exec.hive.infer-source-parallelism.enabled,
> > > > >     etc) ?
> > > >
> > > >
> > > > Thanks for your suggestion. I have supplemented the explanation
> > regarding
> > > > the default values.
> > > >
> > > > > Since we are introducing the mode as a configuration option,
> > > > >     could it make sense to have `InferMode.NONE` option also?
> > > > >     The `NONE` option would disable the inference.
> > > >
> > > >
> > > > This is a good idea. Looking ahead, it could eliminate the need for
> > > > introducing
> > > > a new configuration option. I haven't identified any potential
> > > > compatibility issues
> > > > as yet. If there are no further ideas from others, I'll go ahead and
> > > update
> > > > the FLIP to
> > > > introducing InferMode.NONE.
> > > >
> > > > Best,
> > > > Xia
> > > >
> > > > Muhammet Orazov <mor+fl...@morazow.com.invalid> 于2024年4月17日周三
> 10:31写道:
> > > >
> > > > > Hello Xia,
> > > > >
> > > > > Thanks for the FLIP!
> > > > >
> > > > > Since we are introducing the mode as a configuration option,
> > > > > could it make sense to have `InferMode.NONE` option also?
> > > > > The `NONE` option would disable the inference.
> > > > >
> > > > > This way we deprecate the
> `table.exec.hive.infer-source-parallelism`
> > > > > and no additional
> `table.exec.hive.infer-source-parallelism.enabled`
> > > > > option is required.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best,
> > > > > Muhammet
> > > > >
> > > > > On 2024-04-16 07:07, Xia Sun wrote:
> > > > > > Hi everyone,
> > > > > > I would like to start a discussion on FLIP-445: Support dynamic
> > > > > > parallelism
> > > > > > inference for HiveSource[1].
> > > > > >
> > > > > > FLIP-379[2] has introduced dynamic source parallelism inference
> for
> > > > > > batch
> > > > > > jobs, which can utilize runtime information to more accurately
> > decide
> > > > > > the
> > > > > > source parallelism. As a follow-up task, we plan to implement the
> > > > > > dynamic
> > > > > > parallelism inference interface for HiveSource, and also switch
> the
> > > > > > default
> > > > > > static parallelism inference to dynamic parallelism inference.
> > > > > >
> > > > > > Looking forward to your feedback and suggestions, thanks.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-445%3A+Support+dynamic+parallelism+inference+for+HiveSource
> > > > > > [2]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
> > > > > >
> > > > > > Best regards,
> > > > > > Xia
> > > > >
> > > >
> > >
> >
>

Reply via email to