Hi David,

Thanks for your feedback.

First of all, by keeping some spare resources around, do you mean
'Redundant TaskManagers' [1]? If not, what is the difference between
spare resources and redundant TaskManagers?

Secondly, IMHO the difference between min-reserved resources and spare
resources is that we could configure a rather large min-reserved
resource for use cases that submit lots of short-lived jobs
concurrently, but we don't want to configure a large spare resource,
since this might double the total resource usage and lead to resource
waste.

Looking forward to hearing from you.

Regards,
Xiangyu

[1] https://issues.apache.org/jira/browse/FLINK-18625

On Tue, Oct 3, 2023 at 05:00, David Morávek <d...@apache.org> wrote:

> Hi Xiangyu,
>
> The sentiment of the FLIP makes sense, but I keep wondering whether this
> is the best way to think about the problem. I assume that "interactive
> session cluster" users always want to keep some spare resources around (up
> to a configured threshold) to reduce cold start instead of statically
> configuring the minimum.
>
> It's just a tiny change from the original proposal, but it could make all
> the difference (eliminate overprovisioning, maintain latencies with a
> growing # of jobs, ..)
>
> WDYT?
>
> Best,
> D.
>
> On Mon, Sep 25, 2023 at 5:11 PM Jing Ge <j...@ververica.com.invalid>
> wrote:
>
>> Hi Yangze,
>>
>> Thanks for the clarification. The example of two batch jobs teaming up
>> with one streaming job is interesting.
>>
>> Best regards,
>> Jing
>>
>> On Wed, Sep 20, 2023 at 7:19 PM Yangze Guo <karma...@gmail.com> wrote:
>>
>> > Thanks for the comments, Jing.
>> >
>> > > Will the minimum resource configuration also take effect for streaming
>> > > jobs in application mode?
>> > > Since it is not recommended to configure slotmanager.number-of-slots.max
>> > > for streaming jobs, does it make sense to disable it for common streaming
>> > > jobs? At least disable the check for avoiding the oscillation?
>> >
>> > Yes.
>> > The minimum resource configuration will only be disabled in
>> > standalone clusters atm. I agree it makes sense to disable it for a pure
>> > streaming job, however:
>> > - By default, the minimum resource is configured to 0. If users do not
>> >   proactively set it, both the oscillation check and the minimum
>> >   restriction can be considered disabled.
>> > - The minimum resource is a cluster-level configuration rather than a
>> >   job-level configuration. If a user has an application with two batch
>> >   jobs preceding the streaming job, they may also require this
>> >   configuration to accelerate the execution of the batch jobs.
>> >
>> > WDYT?
>> >
>> > Best,
>> > Yangze Guo
>> >
>> > On Thu, Sep 21, 2023 at 4:49 AM Jing Ge <j...@ververica.com.invalid>
>> > wrote:
>> > >
>> > > Hi Xiangyu,
>> > >
>> > > Thanks for driving it! There is one thing I am not really sure I
>> > > understand correctly.
>> > >
>> > > According to the FLIP: "The minimum resource limitation will be implemented
>> > > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
>> > >
>> > > Each time when SlotManager needs to reconcile the cluster resources or
>> > > fulfill job resource requirements, the DefaultResourceAllocationStrategy
>> > > will check if the minimum resource requirement has been fulfilled. If it is
>> > > not, DefaultResourceAllocationStrategy will request new PendingTaskManagers
>> > > and FineGrainedSlotManager will allocate new worker resources accordingly."
>> > >
>> > > "To avoid this oscillation, we need to check the worker number derived from
>> > > minimum and maximum resource configuration is consistent before starting
>> > > SlotManager."
>> > >
>> > > Will the minimum resource configuration also take effect for streaming jobs
>> > > in application mode?
>> > > Since it is not recommended to configure slotmanager.number-of-slots.max
>> > > for streaming jobs, does it make sense to disable it for common streaming
>> > > jobs? At least disable the check for avoiding the oscillation?
>> > >
>> > > Best regards,
>> > > Jing
>> > >
>> > >
>> > > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao <zhanghao.c...@outlook.com>
>> > > wrote:
>> > >
>> > > > Thanks for driving this, Xiangyu. We use Session clusters for quick SQL
>> > > > debugging internally, and found cold-start job submission slow due to the
>> > > > lack of the exact minimum resource reservation feature proposed here. This
>> > > > should improve the experience a lot for running short-lived jobs in
>> > > > session clusters.
>> > > >
>> > > > Best,
>> > > > Zhanghao Chen
>> > > > ________________________________
>> > > > From: Yangze Guo <karma...@gmail.com>
>> > > > Sent: September 19, 2023 13:10
>> > > > To: xiangyu feng <xiangyu...@gmail.com>
>> > > > Cc: dev@flink.apache.org <dev@flink.apache.org>
>> > > > Subject: Re: [Discuss] FLIP-362: Support minimum resource limitation
>> > > >
>> > > > Thanks for driving this @Xiangyu. This is a feature that many users
>> > > > have requested for a long time. +1 for the overall proposal.
>> > > >
>> > > > Best,
>> > > > Yangze Guo
>> > > >
>> > > > On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng <xiangyu...@gmail.com>
>> > > > wrote:
>> > > > >
>> > > > > Hi Devs,
>> > > > >
>> > > > > I'm opening this thread to discuss FLIP-362: Support minimum resource
>> > > > > limitation. The design doc can be found at:
>> > > > > FLIP-362: Support minimum resource limitation
>> > > > >
>> > > > > Currently, the Flink cluster only requests Task Managers (TMs) when
>> > > > > there is a resource requirement, and idle TMs are released after a
>> > > > > certain period of time.
>> > > > > However, in certain scenarios, such as running short-lived jobs in a
>> > > > > session cluster and scheduling batch jobs stage by stage, we need to
>> > > > > improve the efficiency of job execution by maintaining a certain
>> > > > > number of available workers in the cluster at all times.
>> > > > >
>> > > > > After discussing with Yangze, we introduced this new feature. The newly
>> > > > > added public options and proposed changes are described in this FLIP.
>> > > > >
>> > > > > Looking forward to your feedback, thanks.
>> > > > >
>> > > > > Best regards,
>> > > > > Xiangyu
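
For readers following the thread, the consistency check quoted from the FLIP ("check the worker number derived from minimum and maximum resource configuration is consistent") can be illustrated with a rough sketch. This is not Flink's implementation: the function names, the slot-based parameters, and the rounding rules are assumptions made for illustration only. The idea is that the minimum is rounded up to whole workers and the maximum is rounded down; if satisfying the minimum needs more workers than the maximum allows, the cluster could keep requesting and releasing workers.

```python
def derive_worker_bounds(min_slots: int, max_slots: int, slots_per_worker: int) -> tuple[int, int]:
    """Return (workers needed to satisfy the minimum, workers allowed by the maximum).

    Hypothetical helper: the parameters stand in for the cluster-level
    minimum/maximum settings discussed in the thread.
    """
    if slots_per_worker <= 0:
        raise ValueError("slots_per_worker must be positive")
    # Round up: a partially used worker still has to be started.
    min_workers = -(-min_slots // slots_per_worker)
    # Round down: a worker that would exceed the maximum cannot be started.
    max_workers = max_slots // slots_per_worker
    return min_workers, max_workers


def is_consistent(min_slots: int, max_slots: int, slots_per_worker: int) -> bool:
    """True if the minimum can be met without exceeding the maximum."""
    min_workers, max_workers = derive_worker_bounds(min_slots, max_slots, slots_per_worker)
    return min_workers <= max_workers


if __name__ == "__main__":
    # Default minimum of 0 (as mentioned in the thread): the check passes trivially.
    print(is_consistent(0, 7, 4))
    # Minimum of 5 slots needs 2 workers (8 slots), but a maximum of 7 slots
    # only allows 1 worker, so this configuration would be rejected.
    print(is_consistent(5, 7, 4))
```

With a minimum of 0, the check never fails, which matches the point made in the thread that the restriction is effectively disabled unless users set it.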