Hi all,

Thanks to everyone involved in the discussion.
I have started the voting thread for all the sub FLIPs (FLIP-98, FLIP-99, FLIP-100, FLIP-101, FLIP-102, FLIP-103, and FLIP-104). Please take some time to vote on the FLIPs you are interested in. Thanks!

lining jing <jinglini...@gmail.com> wrote on Wed, Feb 19, 2020 at 3:54 PM:

> I think we can create the PR for the GC status later if we can find an
> easy way to obtain it; until then, users can get the GC logs through
> FLIP-103.
>
> By the way, there was a similar topic, 'FlameGraph In Job Vertex', in FLIP-75
> in the early discussion stage:
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#heading=h.6xcek9hyxzu0
> We moved it into another FLIP to discuss later, since FLIP-75 is heavy
> enough.
>
> And I have created FLINK-15422
> <https://issues.apache.org/jira/browse/FLINK-15422> to get more metrics
> for the JVM's memory.
>
> Piotr Nowojski <pi...@ververica.com> wrote on Tue, Feb 18, 2020 at 7:11 PM:
>
> > Hi,
> >
> > A quick question/comment about FLIP-102. Have you thought about adding GC
> > stats? I’m not sure what’s easily doable, but something that would allow
> > users to see GC issues (long/frequent pauses, lots of CPU time spent in the
> > GC) would be quite useful for analysing performance/stability issues
> > without needing to connect profilers in a distributed environment.
> >
> > Piotrek
> >
> > > On 10 Feb 2020, at 10:58, Yadong Xie <vthink...@gmail.com> wrote:
> > >
> > > Hi all
> > > I have drafted the docs of top-level FLIPs for the individual changes
> > > proposed in FLIP-75.
> > > I will add them to the cwiki page and start the voting stage soon if there
> > > is no objection.
> > >
> > > - FLIP-98: Better Back Pressure Detection
> > > <https://docs.google.com/document/d/1b4GadCze-36x5TPHz6ie4WI9fOUuxWoT_rWWgeg68oo/edit?usp=sharing>
> > > - FLIP-99: Make Max Exception Configurable
> > > <https://docs.google.com/document/d/1tsPpTEx5WqliOAUC924xzRxYOalUuB-GoznGPcxSzJo/edit?usp=sharing>
> > > - FLIP-100: Add Attempt Information
> > > <https://docs.google.com/document/d/1Ww7biOr6WMVfoYhtBTJftRqEm9FGo33AXgYibdXy47Y/edit?usp=sharing>
> > > - FLIP-101: Add Pending Slots Tab in Job Detail
> > > <https://docs.google.com/document/d/1ttn7zIn_Z237JOHdmhiei6aCwKdjTU53I07XxA61Fis/edit?usp=sharing>
> > > - FLIP-102: Add More Metrics to TaskManager
> > > <https://docs.google.com/document/d/18yHdsqUJ1FmNRm0hyeCm3nWvPFpvpJgTJ8BYNAa6Ul8/edit?usp=sharing>
> > > - FLIP-103: Better Taskmanager Log Display
> > > <https://docs.google.com/document/d/16eEdW2KeLxvABdoXahx4MMMisW4_P9mKiqUE0F4GO1c/edit?usp=sharing>
> > > - FLIP-104: Add More Metrics to Jobmanager
> > > <https://docs.google.com/document/d/1Fak632iOroOLZFADqwZWu2SS-LUQqCLHnm8Vs3XM5to/edit?usp=sharing>
> > > - FLIP-105: Better Jobmanager Log Display
> > > <https://docs.google.com/document/d/1ayXaZflelaymQuF3l6UOuGEg6zzbSGGewoDq-9SBOPY/edit?usp=sharing>
> > >
> > > Yadong Xie <vthink...@gmail.com> wrote on Sun, Feb 9, 2020 at 7:24 PM:
> > >
> > >> Hi Till
> > >> I got your point. I will create the sub FLIPs and votes according to
> > >> FLIP-75 and the previous discussion soon.
> > >>
> > >> Till Rohrmann <trohrm...@apache.org> wrote on Sun, Feb 9, 2020 at 5:27 PM:
> > >>
> > >>> Hi Yadong,
> > >>>
> > >>> I think it would be fine to simply link to this discussion thread to keep
> > >>> the discussion history. Maybe an easier way would be to create top-level
> > >>> FLIPs for the individual changes proposed in FLIP-75.
> > >>> The reason I'm proposing this is that it would be easier to vote on and to
> > >>> implement because the scope is smaller. But maybe I'm wrong here and others
> > >>> could chime in to voice their opinion.
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie <vthink...@gmail.com> wrote:
> > >>>
> > >>>> Hi Till
> > >>>>
> > >>>> FLIP-75 has been open since September, and the design doc has been iterated
> > >>>> over 3 versions and more than 20 patches.
> > >>>> I had a try, but it is hard to split the design docs into sub FLIPs and keep
> > >>>> all the discussion history at the same time.
> > >>>>
> > >>>> Maybe it is better to start another discussion to talk about the individual
> > >>>> sub FLIP voting, and make the next FLIP follow the new practice if
> > >>>> possible?
> > >>>>
> > >>>> Till Rohrmann <trohrm...@apache.org> wrote on Mon, Feb 3, 2020 at 6:28 PM:
> > >>>>
> > >>>>> I think there is no such description because we never did it before. I just
> > >>>>> figured that FLIP-75 could actually be a good candidate to start this
> > >>>>> practice. We would need a community discussion first, though.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Till
> > >>>>>
> > >>>>> On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie <vthink...@gmail.com> wrote:
> > >>>>>
> > >>>>>> Hi Till
> > >>>>>> I didn’t find how to create a sub FLIP at cwiki.apache.org.
> > >>>>>> Do you mean to create 9 more FLIPs instead of FLIP-75?
> > >>>>>>
> > >>>>>> Till Rohrmann <trohrm...@apache.org> wrote on Thu, Jan 30, 2020 at 11:12 PM:
> > >>>>>>
> > >>>>>>> Would it be easier if FLIP-75 would be the umbrella FLIP and we would vote
> > >>>>>>> on the individual improvements as sub FLIPs? Decreasing the scope should
> > >>>>>>> make things easier.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> Till
> > >>>>>>>
> > >>>>>>> On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger <rmetz...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks a lot for this work! I believe the web UI is very important, in
> > >>>>>>>> particular to new users. I'm very happy to see that you are putting effort
> > >>>>>>>> into improving the visibility into Flink through the proposed changes.
> > >>>>>>>>
> > >>>>>>>> I cannot judge if all the changes make total sense, but the discussion has
> > >>>>>>>> been open since September, and a good number of people have commented in
> > >>>>>>>> the document.
> > >>>>>>>> I wonder if we can move this FLIP to the VOTing stage?
> > >>>>>>>>
> > >>>>>>>> On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for the update Yadong. Big +1 for the proposed improvements for
> > >>>>>>>>> Flink's web UI. I think they will be super helpful for our users.
> > >>>>>>>>>
> > >>>>>>>>> Cheers,
> > >>>>>>>>> Till
> > >>>>>>>>>
> > >>>>>>>>> On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie <vthink...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi everyone
> > >>>>>>>>>>
> > >>>>>>>>>> We have spent some time updating the documentation since the last
> > >>>>>>>>>> discussion.
> > >>>>>>>>>>
> > >>>>>>>>>> In short, the latest FLIP-75 contains the following proposals (including both
> > >>>>>>>>>> frontend and REST API):
> > >>>>>>>>>>
> > >>>>>>>>>> 1. Job Level
> > >>>>>>>>>>    - better job backpressure detection
> > >>>>>>>>>>    - load more feature in job exception
> > >>>>>>>>>>    - show attempt history in the subtask
> > >>>>>>>>>>    - show attempt timeline
> > >>>>>>>>>>    - add pending slots
> > >>>>>>>>>> 2. Task Manager Level
> > >>>>>>>>>>    - add more metrics
> > >>>>>>>>>>    - better log display
> > >>>>>>>>>> 3. Job Manager Level
> > >>>>>>>>>>    - add metrics tab
> > >>>>>>>>>>    - better log display
> > >>>>>>>>>>
> > >>>>>>>>>> To help everyone better understand the proposal, we put effort into making
> > >>>>>>>>>> an online POC <http://101.132.122.69:8081/web/#/overview>.
> > >>>>>>>>>>
> > >>>>>>>>>> Now you can compare the difference between the new and old Web/REST API (the
> > >>>>>>>>>> link is inside the doc)!
> > >>>>>>>>>>
> > >>>>>>>>>> Here is the latest FLIP-75 doc:
> > >>>>>>>>>>
> > >>>>>>>>>> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
> > >>>>>>>>>>
> > >>>>>>>>>> Looking forward to your feedback
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Yadong
> > >>>>>>>>>>
> > >>>>>>>>>> lining jing <jinglini...@gmail.com> wrote on Thu, Oct 24, 2019 at 2:11 PM:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi all, I have updated the backend design in FLIP-75
> > >>>>>>>>>>> <https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing>.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Here are some brief introductions:
> > >>>>>>>>>>>
> > >>>>>>>>>>> - Add metric for managed memory FLINK-14406
> > >>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-14406>.
> > >>>>>>>>>>> - Expose TaskExecutor resource configurations to REST API FLINK-14422
> > >>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-14422>.
> > >>>>>>>>>>> - Add TaskManagerResourceInfo in TaskManagerDetailsInfo to show
> > >>>>>>>>>>> TaskManager Resource FLINK-14435
> > >>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-14435>.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I will continue to update the rest of the backend design in the doc.
> > >>>>>>>>>>> Let's keep discussing here; any feedback is appreciated.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Yadong Xie <vthink...@gmail.com> wrote on Fri, Sep 27, 2019 at 10:13 AM:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi all
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Flink Web UI is the main platform for most users to monitor their jobs and
> > >>>>>>>>>>>> clusters. We rebuilt the Flink web UI in version 1.9.0, but there are
> > >>>>>>>>>>>> still some shortcomings.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This discussion thread aims to provide a better experience for Flink UI
> > >>>>>>>>>>>> users.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Here is the design doc I drafted:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The FLIP can be found at [2].
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Please keep the discussion here, in the mailing list.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Looking forward to your opinions; any feedback is welcome.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]:
> > >>>>>>>>>>>> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > >>>>>>>>>>>> <https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#>
> > >>>>>>>>>>>> [2]:
> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal