Re: [DISCUSS] FLIP-291: Externalized Declarative Resource Management

ConradJam Fri, 03 Feb 2023 01:31:28 -0800

Hi David:

Thank you for driving this FLIP, which helps reduce Flink downtime.


For this FLIP, I would like to share a few ideas:


   - When the number of slots is insufficient, can we stop the user from
   rescaling, or throw an error that tells the user "not enough available
   slots to upgrade; please check your available slots"? Alternatively, we
   could add a request switch (true/false) to allow this behavior.


   - When the user updates the job vertex parallelism, I would like to have
   an interface to query the current status of the parallelism update, so
   that the user or a program can understand the current state. This would
   also help management tools such as the *[1] Flink K8S Operator*.


{
  "status": "Failed",
  "reason": "not enough available slots to upgrade; please check your available slots"
}



   - *Pending*: the job has joined the upgrade queue and will be updated
   later
   - *Rescaling*: the job is rescaling; wait for it to finish
   - *Finished*: the rescaling has completed
   - *Failed*: something went wrong, so the job cannot be upgraded
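
To make the idea more concrete, here is a minimal Java sketch of how the four
states and the slot check could fit together. Everything in it is an
assumption for illustration only: the names RescaleStatusSketch, Status,
Result and submit are hypothetical and are not part of the FLIP or of any
Flink API, and the "strict" flag is just one possible reading of the
true/false request switch mentioned above.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names come from the FLIP or from Flink.
final class RescaleStatusSketch {

    /** The four states proposed in this mail. */
    enum Status { PENDING, RESCALING, FINISHED, FAILED }

    /** A status plus an optional human-readable reason. */
    static final class Result {
        final Status status;
        final Optional<String> reason;

        Result(Status status, Optional<String> reason) {
            this.status = status;
            this.reason = reason;
        }
    }

    /**
     * Decides how to answer a rescale request.
     * strict == true:  reject the request when slots are insufficient.
     * strict == false: accept it and let it wait in the upgrade queue.
     */
    static Result submit(int requiredSlots, int availableSlots, boolean strict) {
        if (requiredSlots <= availableSlots) {
            // Enough slots: the rescale can start right away.
            return new Result(Status.RESCALING, Optional.empty());
        }
        if (strict) {
            String reason = "not enough available slots: required "
                    + requiredSlots + ", available " + availableSlots;
            return new Result(Status.FAILED, Optional.of(reason));
        }
        // Not enough slots, but the caller allowed waiting.
        return new Result(Status.PENDING, Optional.empty());
    }

    public static void main(String[] args) {
        Result r = submit(8, 4, true);
        System.out.println(r.status + " " + r.reason.orElse(""));
    }
}
```

A caller (for example an operator) could poll such a result and only act on
*Failed*, while treating *Pending* and *Rescaling* as "try again later".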

I would like to add the above to the FLIP. What do you think?


   [1] https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/


David Morávek <[email protected]> wrote on Fri, Feb 3, 2023 at 16:42:

> Hi everyone,
>
> This FLIP [1] introduces a new REST API for declaring resource requirements
> for the Adaptive Scheduler. There seems to be a clear need for this API
> based on the discussion on the "Reworking the Rescale API" [2] thread.
>
> Before we get started, this work is heavily based on the prototype [3]
> created by Till Rohrmann, and the FLIP is being published with his consent.
> Big shoutout to him!
>
> Last but not least, thanks to Chesnay and Roman for the initial reviews and
> discussions.
>
> The best start would be watching a short demo [4] that I've recorded, which
> illustrates newly added capabilities (rescaling the running job, handing
> back resources to the RM, and session cluster support).
>
> The intuition behind the FLIP is being able to define resource requirements
> ("resource boundaries") externally that the AdaptiveScheduler can navigate
> within. This is a building block for higher-level efforts such as an
> external Autoscaler. The natural extension of this work would be to allow
> to specify per-vertex ResourceProfiles.
>
> Looking forward to your thoughts; any feedback is appreciated!
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management
> [2] https://lists.apache.org/thread/2f7dgr88xtbmsohtr0f6wmsvw8sw04f5
> [3] https://github.com/tillrohrmann/flink/tree/autoscaling
> [4] https://drive.google.com/file/d/1Vp8W-7Zk_iKXPTAiBT-eLPmCMd_I57Ty/view
>
> Best,
> D.
>


-- 
Best

ConradJam
