Thanks Aihua for the explanation. The proposal looks good to me then.

Thanks,
Zhu Zhu

aihua li <liaihua1...@gmail.com> wrote on Thu, Nov 21, 2019 at 3:59 PM:

> Thanks for the comments Zhu Zhu!
>
> > 1. How do we measure the job throughput? By measuring the job execution
> > time on a finite input data set, or measuring the QPS when the job has
> > reached a stable state?
> > I ask this because, with the LazyFromSource schedule mode, tasks are
> > launched gradually as processing progresses.
> > So if we are measuring the throughput in the latter way,
> > LazyFromSource scheduling would make no difference from Eager
> > scheduling, so we could drop this dimension if we take that approach.
> > If we measure the total execution time, however, it can be kept, since
> > scheduling effectiveness can make a difference, especially with small
> > input data sets.
>
> We plan to measure the job throughput by measuring the QPS when the job
> has reached a stable state.
> If, as you said, LazyFromSource and Eager make no difference under this
> measuring approach, we can adjust the test scenarios after running for a
> while and remove the duplicated part.
>
> > 2. In our prior experiences, the performance result is usually not that
> > stable, which may make the perf degradation harder to detect.
> > Shall we define the rounds to run a job and how to aggregate the
> > result, so that we can get a more reliable final performance result?
>
> Good advice. We plan to run multiple rounds (5 is the default value) per
> scenario, then calculate the average value as the result.
>
>
> > On Nov 21, 2019, at 3:01 PM, Zhu Zhu <reed...@gmail.com> wrote:
> >
> > Thanks Yu for bringing up this discussion.
> > The e2e perf tests can be really helpful and the overall design looks
> > good to me.
> >
> > Sorry it's late but I have 2 questions about the result check.
> > 1. How do we measure the job throughput? By measuring the job execution
> > time on a finite input data set, or measuring the QPS when the job has
> > reached a stable state?
> > I ask this because, with the LazyFromSource schedule mode, tasks are
> > launched gradually as processing progresses.
> > So if we are measuring the throughput in the latter way,
> > LazyFromSource scheduling would make no difference from Eager
> > scheduling, so we could drop this dimension if we take that approach.
> > If we measure the total execution time, however, it can be kept, since
> > scheduling effectiveness can make a difference, especially with small
> > input data sets.
> > 2. In our prior experiences, the performance result is usually not that
> > stable, which may make the perf degradation harder to detect.
> > Shall we define the rounds to run a job and how to aggregate the
> > result, so that we can get a more reliable final performance result?
> >
> > Thanks,
> > Zhu Zhu
> >
> > Yu Li <car...@gmail.com> wrote on Thu, Nov 14, 2019 at 10:52 AM:
> >
> >> Since one week passed and no more comments, I assume the latest FLIP
> >> doc looks good to all and will open a VOTE thread soon for the FLIP.
> >> Thanks for all the comments and discussion!
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Thu, 7 Nov 2019 at 18:35, Yu Li <car...@gmail.com> wrote:
> >>
> >>> Thanks for the comments Biao!
> >>>
> >>> bq. It seems this proposal is separated into several stages. Is there
> >>> a more detailed plan?
> >>> Good point! For stage one we'd like to try introducing the benchmark
> >>> first, so we could guard the release (hopefully starting from 1.10).
> >>> For the other stages we don't have a detailed plan yet, but will add
> >>> child FLIPs when moving on and open new discussions/votes separately.
> >>> I have updated the FLIP document to better reflect this; please check
> >>> it and let me know what you think. Thanks.
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>
> >>> On Tue, 5 Nov 2019 at 10:16, Biao Liu <mmyy1...@gmail.com> wrote:
> >>>
> >>>> Thanks Yu for bringing up this topic.
> >>>>
> >>>> +1 for this proposal. Glad to have e2e performance testing.
> >>>>
> >>>> It seems this proposal is separated into several stages. Is there a
> >>>> more detailed plan?
> >>>>
> >>>> Thanks,
> >>>> Biao /'bɪ.aʊ/
> >>>>
> >>>>
> >>>>
> >>>> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu <qcx978132...@gmail.com> wrote:
> >>>>
> >>>>> +1 for this idea.
> >>>>>
> >>>>> Currently we have the micro benchmarks for Flink, which can help us
> >>>>> find regressions. I think the e2e job performance testing can also
> >>>>> help us cover more scenarios.
> >>>>>
> >>>>> Best,
> >>>>> Congxian
> >>>>>
> >>>>>
> >>>>> Jingsong Li <jingsongl...@gmail.com> wrote on Mon, Nov 4, 2019 at 5:37 PM:
> >>>>>
> >>>>>> +1 for the idea. Thanks Yu for driving this.
> >>>>>> Just curious: can we collect metrics about job scheduling and task
> >>>>>> launch? The speed of this part is also important.
> >>>>>> We can add tests to watch it too.
> >>>>>>
> >>>>>> Looking forward to more batch test support.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong Lee
> >>>>>>
> >>>>>> On Mon, Nov 4, 2019 at 10:00 AM OpenInx <open...@gmail.com> wrote:
> >>>>>>
> >>>>>>>> The test cases are written in Java and scripts in Python. We
> >>>>>>>> propose a separate directory/module in parallel with
> >>>>>>>> flink-end-to-end-tests, with the name of flink-end-to-end-perf-tests.
> >>>>>>>
> >>>>>>> Glad to see that the newly introduced e2e tests will be written in
> >>>>>>> Java, because I'm reworking the existing e2e test suites from Bash
> >>>>>>> scripts into Java test cases so that we can support more external
> >>>>>>> systems, such as running the testing job on yarn+flink,
> >>>>>>> docker+flink, standalone+flink, a distributed Kafka cluster, etc.
> >>>>>>> BTW, I think the perf e2e test suites will also need to be designed
> >>>>>>> to run on both standalone and distributed environments, which will
> >>>>>>> be helpful for developing & evaluating the perf.
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> On Mon, Nov 4, 2019 at 9:31 AM aihua li <liaihua1...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> In stage 1, the checkpoint mode isn't disabled, and heap is used
> >>>>>>>> as the state backend.
> >>>>>>>> I think there should be some special scenarios to test checkpoints
> >>>>>>>> and state backends, which will be discussed and added in
> >>>>>>>> release 1.11.
> >>>>>>>>
> >>>>>>>>> On Nov 2, 2019, at 12:13 AM, Yun Tang <myas...@live.com> wrote:
> >>>>>>>>>
> >>>>>>>>> By the way, do you think it's worthwhile to add a checkpoint mode
> >>>>>>>>> which just disables checkpointing when running the end-to-end
> >>>>>>>>> jobs? And when will stage 2 and stage 3 be discussed in more
> >>>>>>>>> detail?
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best, Jingsong Lee
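

[Editor's note] To make the measurement approach discussed above more concrete (run N rounds per scenario, sample QPS only after the job has reached a stable state, then average the rounds), here is a minimal Java sketch. It is not part of the FLIP or the Flink codebase; the class, the PerfJob interface, and the warm-up/measurement window values are hypothetical placeholders.

import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of the measurement loop described in the thread:
 * N rounds per scenario, QPS sampled during a stable-state window,
 * per-round results averaged. All names and durations are assumptions.
 */
public class ThroughputMeasurementSketch {

    private static final int ROUNDS = 5;                // default rounds per scenario
    private static final long WARM_UP_MILLIS = 60_000;  // assumed time to reach a stable state
    private static final long MEASURE_MILLIS = 120_000; // assumed stable-state sampling window

    /** Hypothetical handle to a running perf job exposing a processed-record counter. */
    public interface PerfJob {
        void start() throws Exception;
        long processedRecords();
        void stop() throws Exception;
    }

    public static double averageQps(PerfJob job) throws Exception {
        List<Double> qpsPerRound = new ArrayList<>();
        for (int round = 0; round < ROUNDS; round++) {
            job.start();
            Thread.sleep(WARM_UP_MILLIS);               // skip the ramp-up phase
            long startRecords = job.processedRecords();
            long startTime = System.currentTimeMillis();
            Thread.sleep(MEASURE_MILLIS);               // sample only the stable window
            long deltaRecords = job.processedRecords() - startRecords;
            long deltaMillis = System.currentTimeMillis() - startTime;
            job.stop();
            qpsPerRound.add(deltaRecords * 1000.0 / deltaMillis);
        }
        // Average across rounds to smooth out run-to-run variance.
        return qpsPerRound.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }
}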
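[Editor's note] Similarly, the stage-1 setup aihua li describes (checkpointing left enabled, heap state backend) could look roughly like the following in a test job's driver. This is a sketch under assumptions: the checkpoint interval, checkpoint path, and class name are made up, and FsStateBackend is used here only as one heap-based state backend available in Flink at the time; the FLIP does not specify which backend is meant.

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/**
 * Illustrative sketch of a stage-1 job environment: checkpointing enabled
 * and a heap-based state backend. Interval, path, and class name are assumed.
 */
public class Stage1EnvironmentSketch {

    public static StreamExecutionEnvironment createEnvironment() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing stays enabled in stage 1; the 60s interval is an assumed value.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        // FsStateBackend keeps working state on the TaskManager heap and writes
        // checkpoints to the given path (one possible "heap" state backend choice).
        env.setStateBackend(new FsStateBackend("file:///tmp/flink-perf-checkpoints"));

        return env;
    }
}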