Hi Zhu,
Thanks for the feedback!

1. Good idea. Users should be more familiar with slots as the resource
units.

2. You remind me that "speculative attempts" are the execution attempts
started by the SpeculativeScheduler when slow tasks are detected, whereas the
current execution attempts other than the "most current" one are not
necessarily speculative attempts. I agree we should rename the field.

3. ArchivedSpeculativeExecutionVertex seems to have been introduced with
speculative execution to handle the speculative attempts as part of the
execution history. Since this FLIP handles the attempts in a more
proper way, I agree that we can remove
ArchivedSpeculativeExecutionVertex.

Thanks again and I'll update the FLIP later according to these suggestions.

On Thu, Jul 7, 2022 at 4:35 PM Zhu Zhu <reed...@gmail.com> wrote:

> Thanks for writing this FLIP and initiating the discussion, Gen, Yun and
> Junhan!
> It will be very useful to have these improvements on the web UI for
> speculative execution users, allowing them to know what is happening.
> I just have a few comments regarding the design details:
>
> 1. Can we also show "Blocked Slots" in the resource card, so that users
> can easily figure out how many slots are available/blocked/in-use?
> 2. I think "speculative-attempts" is not accurate, because the
> root/fastest current attempt can be a speculative execution attempt, and in
> this case "speculative-attempts" will contain the initial execution
> attempt. How about naming it "other-concurrent-attempts"?
> 3. I think ArchivedSpeculativeExecutionVertex is not necessarily
> needed. We can rework the ArchivedExecutionVertex to contain a set of
> current execution attempts. The set will have only one element in
> non-speculative cases though. In this way, we can have a unified
> processing for ArchivedExecutionVertex in speculative/non-speculative
> cases.
>
> Thanks,
> Zhu
>
> On Tue, Jul 5, 2022 at 3:10 PM Gen Luo <luogen...@gmail.com> wrote:
>
> >
> > Hi everyone,
> >
> > The speculative execution for batch jobs has been proposed and accepted in
> > FLIP-168[1], as well as the related blocklist mechanism in FLIP-224[2]. As
> > a follow-up step, the Flink Web UI needs to be enhanced to display the
> > related information if the speculative execution mechanism is enabled.
> >
> > Junhan Yang, Yun Gao and I would like to start the discussion about the Web
> > UI enhancement and the corresponding REST API changes in FLIP-249[3],
> > including:
> > - show the speculative executions in the subtask list and the backpressure
> > page, where the fastest is shown directly while others are folded;
> > - show the number of the blocked task managers in the Task Managers and
> > Slots card, when the number is not 0;
> > - show the BLOCKED label in the task manager list and the task manager
> > detail page for the blocked task managers.
> >
> > All changes are expected to be transparent to users who don’t use
> > speculative execution.
> >
> > Please see the FLIP page[3] for more details. Looking forward to your
> > feedback.
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism
> > [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-249%3A+Flink+Web+UI+Enhancement+for+Speculative+Execution
>
