Hi Chesnay,

Thanks for your input.

> That seems like something we'd maybe want to introduce consistently for
> all checkpoint-related endpoints.

Do you think we should scope filtering out of this FLIP and follow up with
a separate FLIP? As far as I can see, it would only affect the
/jobs/:jobid/checkpoints endpoint, since the other endpoints (details, and
details with vertices) operate on a specific checkpoint.

> I'm also not sure about returning a 404 if no checkpoints exists
> (especially with the filtering) but the job is there.

Yes, I see your point. I can go with your suggestion of returning either
latest:{} or latest:{...checkpoint info...}; we would only need to wrap the
details.

> The FLIP should also cover the error cases when it is called for jobs
> that don't have checkpointing enabled (e.g., batch).

I will add this to the FLIP. I guess we should just return a 400 response
in that case.

Best Regards
Ahmed Hamdy

On Thu, 24 Jul 2025 at 15:03, Chesnay Schepler <ches...@apache.org> wrote:

> I think the idea of filtering is interesting but I do wonder if we
> should introduce it as part of this FLIP.
> That seems like something we'd maybe want to introduce consistently for
> all checkpoint-related endpoints.
>
> I'm also not sure about returning a 404 if no checkpoints exists
> (especially with the filtering) but the job is there.
> It's a bit annoying to handle on the client-side, especially since there
> are other 404 causes, and it can spuriously happen despite no issue on
> the client side (e.g., when the job is still initializing, or just
> started, or the JM has restarted and lost the checkpoint history (I'm
> not sure if the checkpoint we restore from is included in there)).
> As an alternative it could be either latest:{} or latest:{...checkpoint
> info...}
>
> The FLIP should also cover the error cases when it is called for jobs
> that don't have checkpointing enabled (e.g., batch).
>
> On 22/07/2025 06:35, Ahmed Hamdy wrote:
> > Hi Poorvank
> > yes the idea is to do the latest checkpoint Id lookup from the history and
> > use it to return the checkpoint details.
> >
> >> Possible to consider adding type (savepoint/checkpoint)
> >> filtering: Since cache returns AbstractCheckpointStats
> >
> > yeah that's a good idea, I believe it might be useful in some cases.
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Fri, 18 Jul 2025 at 20:39, Poorvank Bhatia <puravbhat...@gmail.com>
> > wrote:
> >
> >> Hi Ahmed, Thank you for the FLIP.
> >> +1 (non-binding) for this feature.
> >>
> >> I have two implementation questions:
> >>
> >> 1. Approach for finding latest checkpoints: Since the FLIP mentions
> >> "utilizing existing CheckpointStatsCache
> >> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886441#FLIP536:AddlatestcheckpointdetailsendpointtoRestAPI-ImplementationDetails>"
> >> but that cache only supports lookup by checkpoint ID
> >> (tryGet(long checkpointId))
> >> <https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/checkpoints/CheckpointStatsCache.java#L71>,
> >> do you intend to use getLatestCompletedCheckpoint()
> >> <https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsHistory.java#L147>
> >> to find the latest checkpoint, then cache it using
> >> checkpointStatsCache.tryAdd(). Is this the intended approach, if not can
> >> you clarify more.
> >> 2. Possible to consider adding type (savepoint/checkpoint)
> >> filtering: Since cache returns AbstractCheckpointStats
> >> <https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/AbstractCheckpointStats.java>
> >> which has CheckpointProperties
> >> <https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointProperties.java>
> >> that can distinguish between regular checkpoints and savepoints, would
> >> it be valuable to extend the endpoint to support type filtering? i.e
> >>
> >> GET
> >> /jobs/:jobid/checkpoints/details/latest?status=COMPLETED&type=SAVEPOINT
> >>
> >> On Fri, Jul 18, 2025 at 9:48 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:
> >>
> >>> Hi David,
> >>> Thanks for the feedback, I guess an alternative approach would be adding
> >>> paging and sorting to the checkpointing stats query, however this will
> >>> still require 2 REST api calls to get the latest checkpoint details as
> >>> the stats endpoint only gives a summary not the details, I am open to
> >>> adding another query parameter to the endpoint in the FLIP to get latest X
> >>> checkpoint details in one go but I honestly didn't see much of a use case
> >>> to have more than one and might complicate how we wanna handle having y
> >>> available checkpoints where 0 < y < X.
> >>> Let me know your thoughts as well as the rest of the community.
> >>>
> >>>
> >>> Best Regards
> >>> Ahmed Hamdy
> >>>
> >>>
> >>> On Fri, 18 Jul 2025 at 16:47, David Radley <david_rad...@uk.ibm.com>
> >>> wrote:
> >>>
> >>>> Hi Ahmed,
> >>>> Thanks for submitting this Flip.
> >>>> What do you think of having /jobs/:jobid/checkpoints with query params
> >>>> to specify sorted criteria and direction and the number of returned
> >>>> elements (page size). This would appear to be more of a standard (and
> >>>> flexible) way of doing a search. To get the latest you would specify a
> >>>> page size of 1 with a time sort criteria and descending direction.
> >>>> WDYT?
> >>>> Warm regards, David.
> >>>>
> >>>>
> >>>> From: Ahmed Hamdy <hamdy10...@gmail.com>
> >>>> Date: Friday, 18 July 2025 at 15:48
> >>>> To: dev@flink.apache.org <dev@flink.apache.org>
> >>>> Subject: [EXTERNAL] [DISCUSS][FLIP-536] Add latest checkpoint details
> >>>> endpoint to Rest API
> >>>> Hi Devs,
> >>>> I would like to start a discussion on FLIP-536[1] for adding a "latest"
> >>>> checkpoint details endpoint to Flink's REST Api. This is a common case
> >>>> I have personally encountered when integrating components with Flink
> >>>> using the Rest API.
> >>>> Let me know your thoughts.
> >>>>
> >>>>
> >>>> 1-
> >>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-536%3A+Add+latest+checkpoint+details+endpoint+to+Rest+API
> >>>> Best Regards
> >>>> Ahmed Hamdy
> >>>>
> >>>> Unless otherwise stated above:
> >>>>
> >>>> IBM United Kingdom Limited
> >>>> Registered in England and Wales with number 741598
> >>>> Registered office: Building C, IBM Hursley Office, Hursley Park Road,
> >>>> Winchester, Hampshire SO21 2JN
> >>>>
> >