Hi, everyone. Thanks for your feedback. If there are no more concerns or comments, I will start the vote tomorrow.
Gen Luo <luogen...@gmail.com> wrote on Mon, Jul 11, 2022 at 11:12:

> Hi Lijie and Zhu,
>
> Thanks for the suggestion. I agree that the name "Blocked Free Slots" is
> clearer to users.
> I'll take the suggestion and update the FLIP.
>
> On Fri, Jul 8, 2022 at 9:12 PM Zhu Zhu <reed...@gmail.com> wrote:
>
>> I agree that it can be more useful to show the number of slots that are
>> free but blocked. Currently users infer the slots in use by subtracting
>> available slots from the total slots. With blocked slots introduced, this
>> can be achieved by subtracting available slots and blocked free slots
>> from the total slots.
>>
>> Therefore, +1 to show "Blocked Free Slots" on the resource card.
>>
>> Thanks,
>> Zhu
>>
>> Lijie Wang <wangdachui9...@gmail.com> wrote on Fri, Jul 8, 2022 at 17:39:
>> >
>> > Hi Gen & Zhu,
>> >
>> > -> 1. Can we also show "Blocked Slots" in the resource card, so that users
>> > can easily figure out how many slots are available/blocked/in-use?
>> >
>> > I think we should describe "available" and "blocked" more clearly. In
>> > my opinion, users should be interested in the number of slots in
>> > the following 3 states:
>> > 1. free and unblocked; I think it's OK to call this state "available".
>> > 2. free and blocked; I think it's not appropriate to call this "blocked"
>> > directly, because "blocked" should include both "free and blocked" and
>> > "in-use and blocked".
>> > 3. in-use
>> >
>> > And the sum of the above 3 kinds of slots should be the total number of
>> > slots in this cluster.
>> >
>> > WDYT?
>> >
>> > Best,
>> > Lijie
>> >
>> > Gen Luo <luogen...@gmail.com> wrote on Fri, Jul 8, 2022 at 16:14:
>> >
>> > > Hi Zhu,
>> > > Thanks for the feedback!
>> > >
>> > > 1. Good idea. Users should be more familiar with the slots as the
>> > > resource units.
>> > >
>> > > 2. You remind me that the "speculative attempts" are execution attempts
>> > > started by the SpeculativeScheduler when slow tasks are detected, while the
>> > > current execution attempts other than the "most current" one are not really
>> > > speculative attempts. I agree we should modify the field name.
>> > >
>> > > 3. ArchivedSpeculativeExecutionVertex seems to have been introduced with
>> > > speculative execution to handle the speculative attempts as a part of the
>> > > execution history. Since this FLIP handles the attempts in a more
>> > > proper way, I agree that we can remove
>> > > ArchivedSpeculativeExecutionVertex.
>> > >
>> > > Thanks again and I'll update the FLIP later according to these suggestions.
>> > >
>> > > On Thu, Jul 7, 2022 at 4:35 PM Zhu Zhu <reed...@gmail.com> wrote:
>> > >
>> > > > Thanks for writing this FLIP and initiating the discussion, Gen, Yun and
>> > > > Junhan!
>> > > > It will be very useful to have these improvements on the web UI for
>> > > > speculative execution users, allowing them to know what is happening.
>> > > > I just have a few comments regarding the design details:
>> > > >
>> > > > 1. Can we also show "Blocked Slots" in the resource card, so that users
>> > > > can easily figure out how many slots are available/blocked/in-use?
>> > > > 2. I think "speculative-attempts" is not accurate, because the
>> > > > root/fastest current can be a speculative execution attempt, and in
>> > > > this case "speculative-attempts" will contain the initial execution
>> > > > attempt. How about naming it "other-concurrent-attempts"?
>> > > > 3. I think ArchivedSpeculativeExecutionVertex is not necessarily
>> > > > needed. We can rework the ArchivedExecutionVertex to contain a set of
>> > > > current execution attempts. The set will have only one element in
>> > > > non-speculative cases though.
>> > > > In this way, we can have unified
>> > > > processing for ArchivedExecutionVertex in speculative/non-speculative
>> > > > cases.
>> > > >
>> > > > Thanks,
>> > > > Zhu
>> > > >
>> > > > Gen Luo <luogen...@gmail.com> wrote on Tue, Jul 5, 2022 at 15:10:
>> > > >
>> > > > > Hi everyone,
>> > > > >
>> > > > > Speculative execution for batch jobs has been proposed and accepted in
>> > > > > FLIP-168[1], as well as the related blocklist mechanism in FLIP-224[2]. As
>> > > > > a follow-up step, the Flink Web UI needs to be enhanced to display the
>> > > > > related information if the speculative execution mechanism is enabled.
>> > > > >
>> > > > > Junhan Yang, Yun Gao and I would like to start the discussion about the Web
>> > > > > UI enhancement and the corresponding REST API changes in FLIP-249[3],
>> > > > > including:
>> > > > > - show the speculative executions in the subtask list and the backpressure
>> > > > > page, where the fastest is shown directly while others are folded;
>> > > > > - show the number of the blocked task managers in the Task Managers and
>> > > > > Slots card, when the number is not 0;
>> > > > > - show the BLOCKED label in the task manager list and the task manager
>> > > > > detail page for the blocked task managers.
>> > > > >
>> > > > > All changes are expected to be transparent to users who don’t use speculative
>> > > > > execution.
>> > > > >
>> > > > > Please see the FLIP page[3] for more details. Looking forward to your
>> > > > > feedback.
>> > > > >
>> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
>> > > > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism
>> > > > > [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-249%3A+Flink+Web+UI+Enhancement+for+Speculative+Execution
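
For reference, the slot accounting agreed on in this thread (total slots = available + blocked free + in-use) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the `SlotCounts` class and its field names are hypothetical and are not taken from Flink or from the FLIP-249 REST API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotCounts:
    """Hypothetical holder for the numbers shown on the resource card."""

    total: int         # all slots in the cluster
    available: int     # free and unblocked
    blocked_free: int  # free, but on a blocked task manager

    def in_use(self) -> int:
        # In-use slots are whatever remains after removing both kinds
        # of free slots from the total.
        used = self.total - self.available - self.blocked_free
        if used < 0:
            raise ValueError("slot counts are inconsistent")
        return used


counts = SlotCounts(total=20, available=12, blocked_free=3)
print(counts.in_use())  # prints 5
```

Without the "Blocked Free Slots" number, subtracting only the available slots from the total would report 8 slots in use here, which is the ambiguity the rename is meant to remove.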