Hi,

It seems that there is a notable overlap with what Nexmark already does:
Nexmark measures performance and detects regressions by exercising the whole
Beam model in both batch and streaming modes with several runners. It also
runs on synthetic data. Nexmark is also already included as PostCommits in
the CI and dashboards.

Shall we merge the two?

Best
Etienne
On Monday, 10 September 2018 at 12:56 +0200, Łukasz Gajowy wrote:
> Hello everyone,
>
> thank you for all your comments on the proposal. To sum up:
>
> A set of performance tests exercising core Beam transforms (ParDo,
> GroupByKey, CoGroupByKey, Combine) will be implemented for the Java and
> Python SDKs. Those tests will make it possible to:
> - measure the performance of the transforms on various runners
> - exercise the transforms by creating stressful conditions and big loads
>   using the Synthetic Source and Synthetic Step API (delays, keeping the CPU
>   busy or asleep, processing large keys and values, performing fanout or
>   reiteration of inputs)
> - run in both batch and streaming contexts
> - gather various metrics
> - notice regressions by comparing data from consecutive Jenkins runs
>
> Metrics (runtime, consumed bytes, memory usage, split/bundle count) can be
> gathered during test invocations. We will start with runtime and leverage
> the Metrics API to collect the other metrics in later phases of development.
>
> The tests will be fully configurable through pipeline options, and it will
> be possible to run any custom scenario manually. However, a representative
> set of testing scenarios will be run periodically using Jenkins.
>
> Regards,
> Łukasz
>
> On Wed, 5 Sep 2018 at 20:31, Rafael Fernandez <[email protected]> wrote:
> > neat! left a comment or two
> >
> > On Mon, Sep 3, 2018 at 3:53 AM Łukasz Gajowy <[email protected]> wrote:
> > > Hi all!
> > >
> > > I'm bumping this (in case you missed it). Any feedback and questions are
> > > welcome!
> > >
> > > Best regards,
> > > Łukasz
> > >
> > > On Mon, 13 Aug 2018 at 13:51, Jean-Baptiste Onofré <[email protected]>
> > > wrote:
> > > > Hi Lukasz,
> > > >
> > > > Thanks for the update, and the abstract looks promising.
> > > >
> > > > Let me take a look at the doc.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 13/08/2018 13:24, Łukasz Gajowy wrote:
> > > > > Hi all,
> > > > >
> > > > > since the Synthetic Sources API has been introduced in the Java and
> > > > > Python SDKs, it can be used to test some basic Apache Beam operations
> > > > > (i.e. GroupByKey, CoGroupByKey, Combine, ParDo and ParDo with
> > > > > SideInput) in terms of performance. This, in brief, is why we'd like
> > > > > to share the below proposal:
> > > > >
> > > > > https://docs.google.com/document/d/1PuIQv4v06eosKKwT76u7S6IP88AnXhTf870Rcj1AHt4/edit?usp=sharing
> > > > >
> > > > > Let us know what you think in the document's comments. Thank you in
> > > > > advance for all the feedback!
> > > > >
> > > > > Łukasz