Hi Matthias,

Sorry for the long delay in following up. I am still doing the research needed to address the issues raised earlier in this FLIP.
FYI, the new round of adjustments has been completed, and I've attached an overview of a few issues.

> Also keep in mind that it's not only the JobResourceRequirements update that
> can trigger a rescale.
> Updating the available resources through newResourcesAvailable [2] can also
> initiate a rescale.
> I don't see this being clearly laid out in FLIP-495, yet.

Thanks for the reminder. I added it to the section 'Current core adaptive scheduler state transition about rescale' [1].

> I'm wondering whether that could be even considered a bug that manifests
> itself if the stopWithSavepoint mechanism fails and the job is executed.

Sorry, Matthias, I am not very familiar with the original design intentions of this part. However, since the current behavior is allowed and the community has not received any risk-related feedback, it seems acceptable not to address it in the current FLIP. Of course, if we now define it as unintended behavior, I think that would also be reasonable because, if I remember correctly, it aligns with the semantics of the DefaultScheduler for streaming jobs.

> It might be useful to make this clearer in the FLIP as well (i.e. adding more
> context aside from the bullet points).

Thanks, Matthias, for the reminder. I have added some perspectives to the Rescale state definition section and stated the reasons behind them [2].

> I agree that we might want to have this information being stored in DFS.
> That way the information would survive a JobManager failover.
> It would be still nice to have all three options being reflected in the FLIP,
> though, with the pro's and con's having the feature properly documented.

Thanks, Matthias, for the reminder and for sharing. I have updated the 'About rescale events storage' section of the document accordingly [3].

> I also want to point out that we have four different notions of resource
> configurations:
> - Desired resources: The ideal resource configuration that we want to achieve
>   for a job if enough Task slots are available (essentially the upper bound
>   of the job's parallelism)
> - Sufficient resources: A minimum resource configuration that the job can run
>   on (the lower bound of the job's parallelism)
> - Current resources: The resource configuration the job runs on before
>   rescaling
> - Follow-up resources
> The first two are the resource configurations the rescale decision is based
> on.
> The last two are the actual applied resource configurations.
> Keep in mind that the latter two are not necessarily matching the resource
> configurations that were considered when deciding on the rescaling.
> Especially the case where the desired resources were met when rescaling was
> triggered but where task slots are lost while rescaling can have a surprising
> outcome.
> We might want to have this reflected in the rescale event.

Thank you, Matthias, for sharing! This is very reasonable and will make the information more readable and more useful to users. I have made some adjustments in the corresponding part [4].

Any suggestions or comments would be greatly appreciated!

Best regards,
Yuepeng
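
P.S. To make the four notions of resource configuration concrete, here is a minimal sketch of how a rescale event could carry them. It is only an illustration: `RescaleResourceView` and its fields are hypothetical names that do not exist in Flink or in FLIP-495, and each configuration is collapsed to a single job-level parallelism for brevity, whereas the real requirements are defined per job vertex.

```java
// Hypothetical sketch; not part of Flink's API or of FLIP-495.
// Assumption: every resource configuration is reduced to one int parallelism.
final class RescaleResourceView {
    // Bounds that the rescale decision is based on.
    final int sufficientParallelism; // lower bound: minimum the job can run on
    final int desiredParallelism;    // upper bound: ideal if enough slots exist

    // Configurations that were actually applied.
    final int currentParallelism;    // what the job ran on before rescaling
    final int followUpParallelism;   // what the job runs on after rescaling

    RescaleResourceView(int sufficient, int desired, int current, int followUp) {
        if (sufficient > desired) {
            throw new IllegalArgumentException("sufficient must not exceed desired");
        }
        this.sufficientParallelism = sufficient;
        this.desiredParallelism = desired;
        this.currentParallelism = current;
        this.followUpParallelism = followUp;
    }

    /** True if the applied result is below the desired bound, e.g. because slots were lost mid-rescale. */
    boolean fellShortOfDesired() {
        return followUpParallelism < desiredParallelism;
    }

    /** True if the applied result still lies within [sufficient, desired]. */
    boolean withinBounds() {
        return followUpParallelism >= sufficientParallelism
                && followUpParallelism <= desiredParallelism;
    }
}
```

For example, `new RescaleResourceView(2, 8, 4, 6)` would describe a job that aimed for 8 but ended up at 6: `fellShortOfDesired()` returns true while `withinBounds()` still holds, which is the kind of surprising outcome the event should make visible.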
[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760525#id-[WIP]FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-Currentcoreadaptiveschedulerstatetransitionaboutrescale
[2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760525#id-[WIP]FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-Rescalestatus
[3] https://cwiki.apache.org/confluence/display/FLINK/%5BWIP%5D+FLIP-495%3A+Support+AdaptiveScheduler+record+and+query+the+rescale+history#id-[WIP]FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-Aboutrescaleeventsstorage
[4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760525#id-[WIP]FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-Slots/Resources

On 2025/01/13 07:53:25 Matthias Pohl wrote:
> Hi Yupeng,
> I managed to find some time to respond. See my answers inlined below.
>
> Matthias
>
> On Fri, Jan 3, 2025 at 11:48 AM Yuepeng Pan <panyuep...@apache.org> wrote:
>
> > [...]
> > Sorry for not expressing this part clearly earlier.
> > IIUC, based on the Adaptive Scheduler state diagram [1],
> > when a stop-with-savepoint operation fails but the job is restartable,
> > it transitions to the WaitingForResources state, which implies a
> > potential rescaling process may trigger.
> >
>
> Yes, you're right. That's a scenario that should be covered by the FLIP but
> doesn't include having the Executing state as an initial AdaptiveScheduler
> state. That's also a corner case which is not covered by the current
> implementation. ...at least we're not considering any job resource
> requirement updates during StopWithSavepoint.
>
> I'm wondering whether that could be even considered a bug that manifests
> itself if the stopWithSavepoint mechanism fails and the job is executed.
>
> > From the current logic, a rescale event may be triggered under the
> > following circumstances, called 'rescale triggers' here:
> > - updateJobResourceRequirement
> > - Restart due to recoverable failure
> > - newResourceAvailable
> >
> > From Fig [1], the entire chain of rescale-related scheduler
> > states typically involves several loops:
> > - WaitingForResources -> CreatingExecutionGraph -> Executing ->
> > StopWithSavepoint (error & restartable) -> Restarting -> WaitingForResources
> > - CreatingExecutionGraph [-> WaitingForResources ->
> > CreatingExecutionGraph](optional loop) -> Executing(rescale triggers) ->
> > Restarting -> CreatingExecutionGraph
> >
> > Following your shared ideas:
> > > Based on what I pointed out in my previous post, I would think that a
> > > rescale operation has its starting point in the AdaptiveScheduler's
> > > Executing state (i.e. when the job is running).
> > I attempted to interpret the boundary definition of rescale events.
> >
> > The historical states of the Adaptive Scheduler during a successful
> > rescale event would likely match one of the following patterns?
> > - (Starting) Executing(rescale triggers) -> Restarting ->
> > CreatingExecutionGraph [ -> WaitingForResources -> CreatingExecutionGraph]
> > -> (Ending) Executing
> > - (Starting) Executing -> StopWithSavepoint (error & restartable) ->
> > Restarting -> WaitingForResources -> CreatingExecutionGraph [ ->
> > WaitingForResources -> CreatingExecutionGraph] -> (Ending) Executing
> >
> > If my understanding is incorrect, please feel free to correct me.
> >
>
> No, I see your point now. I was too focused on the Executing state. But
> you're right: WaitingForRescale can also be part of error handling that
> happened while Executing the job or trying to stop the job (i.e.
> StopWithSavepoint). These cases need to be considered when collecting the
> rescale events. Sorry for not grasping your intention earlier and thanks
> for your clarification.
>
> It might be useful to make this clearer in the FLIP as well (i.e. adding
> more context aside from the bullet points).
>
> > 2 - About the status of a rescale event.
> > > - Rescale status: I would say that this section needs to be reworked.
> > Can I understand that Rescale Event does indeed need some status
> > information,
> > such as FAILED, IGNORED, SUCCESS, and PENDING, etc,
> > to indicate the final status of a completed event or the current status of
> > an ongoing event?
> > In other words, the current status fields and their associations are
> > unreasonable,
> > so it needs to be redesigned rather than discarding this descriptive
> > mechanism.
> >
>
> Sounds reasonable.
>
> > [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760525#id-[WIP]FLIP495:SupportAdaptiveSchedulerrecordandquerytherescalehistory-CurrentadaptiveschedulerstatetransitionFig
> >
> > Thank you.
> >
> > Best,
> > Yuepeng
> >
>
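
As a concrete reading of the status discussion quoted above, the following is a minimal sketch of a status type for rescale events. The four constant names come from the thread; everything else is an assumption made for illustration, in particular treating PENDING as the only non-terminal status and allowing a single PENDING-to-terminal transition. `RescaleStatus` as written here is hypothetical and is not an existing Flink type.

```java
// Hypothetical sketch; names mirror the statuses mentioned in this thread,
// not an existing Flink API.
enum RescaleStatus {
    PENDING(false), // assumed: current status of an ongoing rescale event
    SUCCESS(true),  // assumed: final status of a completed rescale event
    FAILED(true),
    IGNORED(true);

    private final boolean terminal;

    RescaleStatus(boolean terminal) {
        this.terminal = terminal;
    }

    /** True for statuses that describe a completed event. */
    boolean isTerminal() {
        return terminal;
    }

    /** Assumed rule: only PENDING may move on, and only to a terminal status. */
    RescaleStatus transitionTo(RescaleStatus next) {
        if (terminal) {
            throw new IllegalStateException(this + " is already final");
        }
        if (!next.terminal) {
            throw new IllegalArgumentException("target status must be final: " + next);
        }
        return next;
    }
}
```

Keeping the terminal flag on the enum itself means a query endpoint could distinguish ongoing from completed events without knowing the individual constants.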