Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-25 Thread Xia Sun
Hi Venkat, Thanks for joining the discussion. Based on our understanding, there are still a significant number of existing tasks using Hive. Indeed, many companies are now migrating their data to the lakehouse, but due to historical reasons, a substantial amount of data still resides in Hive.

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-24 Thread Venkatakrishnan Sowrirajan
Hi Xia, +1 on introducing dynamic parallelism inference for HiveSource. Orthogonal to this discussion, I'm curious how commonly HiveSource is used these days in the industry, given the popularity of table formats/sources like Iceberg, Hudi and Delta Lake? Thanks Venkat On Wed, Apr 24, 2024, 7:41 

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-24 Thread Xia Sun
Hi everyone, Thanks for all the feedback! If there are no more comments, I would like to start the vote thread, thanks again! Best, Xia Ahmed Hamdy wrote on Thu, Apr 18, 2024 at 21:31: > Hi Xia, > I have read through the FLIP and discussion and the new version of the FLIP > looks better. > +1 for the

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-18 Thread Ahmed Hamdy
Hi Xia, I have read through the FLIP and discussion and the new version of the FLIP looks better. +1 for the proposal. Best Regards Ahmed Hamdy On Thu, 18 Apr 2024 at 12:21, Ron Liu wrote: > Hi, Xia > > Thanks for updating, looks good to me. > > Best, > Ron > > Xia Sun wrote on Thu, Apr 18, 2024 at 19:11:

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-18 Thread Ron Liu
Hi, Xia Thanks for updating, looks good to me. Best, Ron Xia Sun wrote on Thu, Apr 18, 2024 at 19:11: > Hi Ron, > Yes, presenting it in a table might be more intuitive. I have already added > the table in the "Public Interfaces | New Config Option" chapter of FLIP. > PTAL~ > > Ron Liu on Thu, Apr 18, 2024

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-18 Thread Xia Sun
Hi Ron, Yes, presenting it in a table might be more intuitive. I have already added the table in the "Public Interfaces | New Config Option" chapter of FLIP. PTAL~ Ron Liu wrote on Thu, Apr 18, 2024 at 18:10: > Hi, Xia > > Thanks for your reply. > > > That means, in terms > of priority,

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-18 Thread Ron Liu
Hi, Xia Thanks for your reply. > That means, in terms of priority, `table.exec.hive.infer-source-parallelism` > `table.exec.hive.infer-source-parallelism.mode`. I still have some confusion, if the `table.exec.hive.infer-source-parallelism` > `table.exec.hive.infer-source-parallelism.mode`,
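For readers following the priority question, here is a minimal Java sketch of one way the two options could be combined, reading "`table.exec.hive.infer-source-parallelism` > `table.exec.hive.infer-source-parallelism.mode`" as "the older boolean switch wins when it is set to false". This is an illustration only, not the FLIP's implementation: the class `InferModeSketch`, the method `resolve`, the `STATIC` and `DYNAMIC` constant names, and both default values are assumptions. Only the two option keys and `InferMode.NONE` come from the thread.

```java
import java.util.Locale;
import java.util.Map;

public class InferModeSketch {
    /** Hypothetical stand-in for the InferMode enum discussed in the thread. */
    public enum InferMode { STATIC, DYNAMIC, NONE }

    static final String LEGACY_KEY = "table.exec.hive.infer-source-parallelism";
    static final String MODE_KEY = "table.exec.hive.infer-source-parallelism.mode";

    /**
     * Resolves the effective mode. The legacy boolean takes priority:
     * if it is explicitly false, inference is off regardless of the mode.
     * The defaults used here ("true" and "dynamic") are assumptions.
     */
    public static InferMode resolve(Map<String, String> conf) {
        boolean legacyEnabled = Boolean.parseBoolean(conf.getOrDefault(LEGACY_KEY, "true"));
        if (!legacyEnabled) {
            return InferMode.NONE;
        }
        String mode = conf.getOrDefault(MODE_KEY, "dynamic");
        return InferMode.valueOf(mode.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Legacy switch off overrides an explicit mode.
        System.out.println(resolve(Map.of(LEGACY_KEY, "false", MODE_KEY, "static")));
        // Legacy switch on (or absent) defers to the mode option.
        System.out.println(resolve(Map.of(MODE_KEY, "static")));
    }
}
```

Under this reading, setting only the mode option is enough for new jobs, while existing jobs that set the boolean to false keep their current behavior.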

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-18 Thread Xia Sun
Hi Ron and Lijie, Thanks for joining the discussion and sharing your suggestions. > the InferMode class should also be introduced in the Public Interfaces > section! Thanks for the reminder, I have now added the InferMode class to the Public Interfaces section as well. >

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-17 Thread Lijie Wang
Thanks for driving the discussion. +1 for the proposal and +1 for the `InferMode.NONE` option. Best, Lijie Ron Liu wrote on Thu, Apr 18, 2024 at 11:36: > Hi, Xia > > Thanks for driving this FLIP. > > This proposal looks good to me overall. However, I have the following minor > questions: > > 1. FLIP

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-17 Thread Ron Liu
Hi, Xia Thanks for driving this FLIP. This proposal looks good to me overall. However, I have the following minor questions: 1. FLIP introduced `table.exec.hive.infer-source-parallelism.mode` as a new parameter, and the value is the enum class `InferMode`, I think the InferMode class should

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-16 Thread Xia Sun
Hi Jeyhun, Muhammet, Thanks for all the feedback! > Could you please mention the default values for the new configurations > (e.g., table.exec.hive.infer-source-parallelism.mode, > table.exec.hive.infer-source-parallelism.enabled, > etc) ? Thanks for your suggestion. I have

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-16 Thread Muhammet Orazov
Hello Xia, Thanks for the FLIP! Since we are introducing the mode as a configuration option, could it make sense to have `InferMode.NONE` option also? The `NONE` option would disable the inference. This way we deprecate the `table.exec.hive.infer-source-parallelism` and no additional

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-16 Thread Jeyhun Karimov
Hi Xia, Thanks for driving this FLIP. +1 from my side. I have one comment. Could you please mention the default values for the new configurations (e.g., table.exec.hive.infer-source-parallelism.mode, table.exec.hive.infer-source-parallelism.enabled, etc) ? Regards, Jeyhun On Tue, Apr 16, 2024

Re: [DISCUSS] FLIP-445: Support dynamic parallelism inference for HiveSource

2024-04-16 Thread Zhu Zhu
Thanks for creating this FLIP. @Xia +1 for this proposal. Dynamic parallelism inference can be helpful to decide a better parallelism. And it's good to unify the settings of static & dynamic parallelism inference. Thanks, Zhu Xia Sun wrote on Tue, Apr 16, 2024 at 15:12: > Hi everyone, > I would like to