I agree that we could benefit from having two types of performance tests (low and high level) that complement each other. Can we detect a regression (if any) automatically and send a report about it? Sorry if we already do that for Nexmark.
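For illustration, one way such an automatic check could work is to compare the latest run's metric against a baseline built from recent runs. This is a minimal sketch, not Beam's actual CI code; the function name and the 20% tolerance are my own illustrative choices:

```python
# Hypothetical sketch of automatic regression detection (not Beam's actual
# CI mechanism): flag a regression when the latest runtime exceeds the mean
# of recent historical runs by more than a fractional tolerance.
from statistics import mean

def detect_regression(runtimes_sec, latest_sec, tolerance=0.20):
    """Return True if `latest_sec` is more than `tolerance` (fractional)
    above the mean of the historical `runtimes_sec`."""
    if not runtimes_sec:
        return False  # no baseline yet, nothing to compare against
    baseline = mean(runtimes_sec)
    return latest_sec > baseline * (1 + tolerance)

# Example: baseline runs around 100 s, so the threshold is 120 s.
print(detect_regression([98, 102, 100], 130))  # True  (above threshold)
print(detect_regression([98, 102, 100], 110))  # False (within tolerance)
```

A real setup would presumably pull the historical numbers from the Jenkins/dashboard storage the thread mentions and send a report when the check fires.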
> On 11 Sep 2018, at 11:29, Etienne Chauchot <[email protected]> wrote:
>
> Hi Lukasz,
>
> Well, having low-level byte[]-based pure performance tests makes sense. And
> having a high-level realistic model (the Nexmark auction system) makes sense
> too, to avoid testing unrealistic pipelines as you describe.
>
> Having common code between the two seems difficult, as both the architecture
> and the model are different.
>
> I'm more concerned about having two CI mechanisms to detect
> functional/performance regressions.
> Best
> Etienne
>
> On Monday, 10 September 2018 at 18:33 +0200, Łukasz Gajowy wrote:
>> In my opinion, and as far as I understand Nexmark, there are benefits to
>> having both types of tests. The load tests we propose can be very
>> straightforward and clearly show what is being tested, thanks to the fact
>> that there is no fixed model but only very "low level" KV<byte[], byte[]>
>> collections. They are more flexible in the shapes of the pipelines they can
>> express (e.g. fanout_64) without having to think about specific use cases.
>>
>> Having both types would allow developers to decide whether they want to
>> create a new Nexmark query for their specific case or develop a new load
>> test (whichever is easier and better fits their case). However, there is a
>> risk: with KV<byte[], byte[]>, a developer can overemphasize cases that can
>> never happen in practice, so we need to be careful about the exact
>> configurations we run.
>>
>> Still, I can imagine that there will surely be code common to both types of
>> tests, and we will seek ways to avoid duplicating it.
>>
>> WDYT?
>>
>> Regards,
>> Łukasz
>>
>> On Mon, 10 Sep 2018 at 16:36, Etienne Chauchot <[email protected]> wrote:
>>> Hi,
>>>
>>> It seems that there is notable overlap with what Nexmark already does:
>>> Nexmark measures performance and regressions by exercising the whole Beam
>>> model in both batch and streaming modes with several runners.
>>> It also computes on synthetic data. Nexmark is also already included in
>>> the PostCommit CI and the dashboards.
>>>
>>> Shall we merge the two?
>>>
>>> Best
>>>
>>> Etienne
>>>
>>> On Monday, 10 September 2018 at 12:56 +0200, Łukasz Gajowy wrote:
>>>> Hello everyone,
>>>>
>>>> thank you for all your comments on the proposal. To sum up:
>>>>
>>>> A set of performance tests exercising the core Beam transforms (ParDo,
>>>> GroupByKey, CoGroupByKey, Combine) will be implemented for the Java and
>>>> Python SDKs. Those tests will allow us to:
>>>> - measure the performance of the transforms on various runners
>>>> - exercise the transforms by creating stressful conditions and big loads
>>>>   using the Synthetic Source and Synthetic Step APIs (delays, keeping the
>>>>   CPU busy or asleep, processing large keys and values, performing fanout
>>>>   or reiteration of inputs)
>>>> - run in both batch and streaming contexts
>>>> - gather various metrics
>>>> - notice regressions by comparing data from consecutive Jenkins runs
>>>>
>>>> Metrics (runtime, consumed bytes, memory usage, split/bundle count) can
>>>> be gathered during test invocations. We will start with runtime and
>>>> leverage the Metrics API to collect the other metrics in later phases of
>>>> development. The tests will be fully configurable through pipeline
>>>> options, and it will be possible to run any custom scenario manually.
>>>> However, a representative set of testing scenarios will be run
>>>> periodically using Jenkins.
>>>>
>>>> Regards,
>>>> Łukasz
>>>>
>>>> On Wed, 5 Sep 2018 at 20:31, Rafael Fernandez <[email protected]> wrote:
>>>>> neat! left a comment or two
>>>>>
>>>>> On Mon, Sep 3, 2018 at 3:53 AM Łukasz Gajowy <[email protected]> wrote:
>>>>>> Hi all!
>>>>>>
>>>>>> I'm bumping this (in case you missed it). Any feedback and questions
>>>>>> are welcome!
>>>>>>
>>>>>> Best regards,
>>>>>> Łukasz
>>>>>>
>>>>>> On Mon, 13 Aug 2018 at 13:51, Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>> Hi Lukasz,
>>>>>>>
>>>>>>> Thanks for the update, and the abstract looks promising.
>>>>>>>
>>>>>>> Let me take a look at the doc.
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>> On 13/08/2018 13:24, Łukasz Gajowy wrote:
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > since the Synthetic Sources API has been introduced in the Java and
>>>>>>> > Python SDKs, it can be used to test some basic Apache Beam
>>>>>>> > operations (i.e. GroupByKey, CoGroupByKey, Combine, ParDo and ParDo
>>>>>>> > with SideInput) in terms of performance. This, in brief, is why
>>>>>>> > we'd like to share the proposal below:
>>>>>>> >
>>>>>>> > https://docs.google.com/document/d/1PuIQv4v06eosKKwT76u7S6IP88AnXhTf870Rcj1AHt4/edit?usp=sharing
>>>>>>> >
>>>>>>> > Let us know what you think in the document's comments. Thank you in
>>>>>>> > advance for all the feedback!
>>>>>>> >
>>>>>>> > Łukasz
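For readers unfamiliar with the "low level" KV<byte[], byte[]> load idea discussed in the thread above, here is a rough plain-Python sketch of what generating synthetic records and applying a fanout step means. The helper names (`synthetic_records`, `fanout`) and all parameters are my own illustrative choices, not the actual Synthetic Source / Synthetic Step API:

```python
# Hypothetical sketch of synthetic KV<byte[], byte[]> load generation with a
# fanout step (illustrative only; not Beam's Synthetic Source API).
import os

def synthetic_records(num_records, key_size, value_size):
    """Yield (key, value) pairs of random bytes with the given sizes."""
    for _ in range(num_records):
        yield (os.urandom(key_size), os.urandom(value_size))

def fanout(records, factor):
    """Emit each input record `factor` times (e.g. factor=64 for fanout_64)."""
    for kv in records:
        for _ in range(factor):
            yield kv

# 10 input records fanned out 64x yields 640 records downstream.
records = list(fanout(synthetic_records(10, key_size=8, value_size=100), 64))
print(len(records))  # 640
```

In the actual proposal, the equivalent generation and fanout would be expressed as Beam transforms configured through pipeline options, with delays and CPU-busy behavior layered on top.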
