I really like these. Happy to have them.

Best
-P.

On Fri, Mar 15, 2019 at 11:16 AM Łukasz Gajowy <[email protected]> wrote:
> Hi Beamers,
>
> an update on this. Together with Kasia, Michał, and cooperating closely
> with Pablo, we have created and scheduled a cron job that runs 7 tests
> for GroupByKey batch scenarios daily. The tests are described in the
> proposal [1] and will be documented later. The dashboards for the tests:
> - run times [2]
> - total load size in bytes [3]
>
> All the metrics are collected using Beam's Metrics API.
>
> Things we have on our horizon:
> - the same set of tests for Java, but in streaming mode
> - similar jobs for the Python SDK
> - running similar suites on the Flink runner
>
> We have also created a set of Dataproc bash scripts that can be used to
> set up a Flink cluster that supports portability [4]. It is ready to use,
> and I've already successfully run the word count example on it using the
> Python SDK. I'm hoping and aiming to run load tests on it soon. :)
>
> Last but not least: we also reused some code to collect metrics through
> the Metrics API in TextIOIT, and we are willing to make a similar change
> to the other IOITs. Dashboards for TextIOIT: [5].
>
> Thanks,
> Łukasz
>
> [1] https://s.apache.org/load-test-basic-operations
> [2] https://apache-beam-testing.appspot.com/explore?dashboard=5643144871804928
> [3] https://apache-beam-testing.appspot.com/explore?dashboard=5701325169885184
> [4] https://github.com/apache/beam/blob/b1ed061fd0c1ed1da562089c939d55715907769d/.test-infra/dataproc/create_flink_cluster.sh
> [5] https://apache-beam-testing.appspot.com/explore?dashboard=5629522644828160
>
> On Wed, Sep 12, 2018 at 2:23 PM Etienne Chauchot <[email protected]> wrote:
>
>> Let me elaborate a bit on my last sentence.
>>
>> On Tuesday, September 11, 2018 at 11:29 +0200, Etienne Chauchot wrote:
>>
>> Hi Lukasz,
>>
>> Well, having low-level byte[]-based pure performance tests makes sense.
>> And having a high-level realistic model (the Nexmark auction system) also
>> makes sense, to avoid testing unrealistic pipelines as you describe.
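[The dashboards above track run time and total load size in bytes, both collected through Beam's Metrics API. As a rough, stdlib-only Python sketch of what the byte-count metric amounts to for KV<byte[], byte[]> load tests — not actual Beam code; the helper names are illustrative:]

```python
import os

def make_record(key_size, value_size):
    """Generate one synthetic KV record as a (key_bytes, value_bytes) pair."""
    return (os.urandom(key_size), os.urandom(value_size))

def total_load_bytes(records):
    """Sum the sizes of all keys and values, mimicking a total-bytes metric."""
    return sum(len(k) + len(v) for k, v in records)

# 1000 records, each with a 10-byte key and a 90-byte value.
records = [make_record(10, 90) for _ in range(1000)]
print(total_load_bytes(records))  # 1000 records x 100 bytes = 100000
```

[In actual Beam pipelines the equivalent counter would be incremented per element inside a step and aggregated by the runner, rather than computed over an in-memory list.]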
>>
>> Having common code between the two seems difficult, as both the
>> architecture and the model are different.
>>
>> I'm more concerned about having two CI mechanisms to detect
>> functional/performance regressions.
>>
>> Even if parts of Nexmark and the performance tests are the same, they
>> could target different objectives: raw performance tests (the new
>> framework) and user-oriented tests (Nexmark). So they might be
>> complementary.
>>
>> We just have to choose how to run them. I think we need only one
>> automatic regression detection tool. IMHO, the most relevant one for
>> functional/performance regressions is Nexmark, because it represents what
>> a real user could do (it simulates an auction system). So let's keep it
>> in the post-commits. Post-commits make it possible to pinpoint the
>> particular commit that introduced a regression.
>>
>> We could run the new performance tests on a schedule.
>>
>> Best
>> Etienne
>>
>> On Monday, September 10, 2018 at 18:33 +0200, Łukasz Gajowy wrote:
>>
>> In my opinion, and as far as I understand Nexmark, there are benefits to
>> having both types of tests. The load tests we propose can be very
>> straightforward and clearly show what is being tested, thanks to the fact
>> that there is no fixed model but only very "low-level" KV<byte[], byte[]>
>> collections. They are more flexible in the shapes of the pipelines they
>> can express, e.g. fanout_64, without having to think about specific use
>> cases.
>>
>> Having both types would allow developers to decide whether they want to
>> create a new Nexmark query for their specific case or develop a new load
>> test (whichever is easier and fits their case better). However, there is
>> a risk: with KV<byte[], byte[]>, a developer can overemphasize cases that
>> never happen in practice, so we need to be careful about the exact
>> configurations we run.
>>
>> Still, I can imagine that there will surely be code common to both types
>> of tests, and we will seek ways to avoid duplicating it.
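[Łukasz mentions pipeline shapes such as fanout_64 over plain KV<byte[], byte[]> collections. As a hedged, stdlib-only Python sketch of the idea (the function name is mine, not Beam's): a fanout step replicates each input element N times downstream, multiplying the load on the following transforms.]

```python
def fanout(records, n):
    """Replicate each record n times, as a fanout_<n> pipeline shape would."""
    for record in records:
        for _ in range(n):
            yield record

# 10 identical KV<byte[], byte[]> inputs pushed through a fanout of 64.
inputs = [(b"key", b"value")] * 10
outputs = list(fanout(inputs, 64))
print(len(outputs))  # 10 inputs x 64 = 640 elements
```

[In a real load test the fanout would be a ParDo emitting n copies per element, and the downstream GroupByKey or Combine would then see n times the input volume.]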
>>
>> WDYT?
>>
>> Regards,
>> Łukasz
>>
>> On Mon, Sep 10, 2018 at 4:36 PM Etienne Chauchot <[email protected]> wrote:
>>
>> Hi,
>> It seems that there is a notable overlap with what Nexmark already does:
>> Nexmark measures performance and regressions by exercising the whole Beam
>> model in both batch and streaming modes with several runners. It also
>> computes on synthetic data. Nexmark is also already included in the CI as
>> post-commits, with dashboards.
>>
>> Shall we merge the two?
>>
>> Best
>>
>> Etienne
>>
>> On Monday, September 10, 2018 at 12:56 +0200, Łukasz Gajowy wrote:
>>
>> Hello everyone,
>>
>> thank you for all your comments on the proposal. To sum up:
>>
>> A set of performance tests exercising the core Beam transforms (ParDo,
>> GroupByKey, CoGroupByKey, Combine) will be implemented for the Java and
>> Python SDKs. Those tests will make it possible to:
>>
>> - measure the performance of the transforms on various runners
>> - exercise the transforms by creating stressful conditions and big loads
>> using the Synthetic Source and Synthetic Step API (delays, keeping the
>> CPU busy or asleep, processing large keys and values, performing fanout
>> or reiteration of inputs)
>> - run in both batch and streaming contexts
>> - gather various metrics
>> - notice regressions by comparing data from consecutive Jenkins runs
>>
>> Metrics (runtime, consumed bytes, memory usage, split/bundle count) can
>> be gathered during test invocations. We will start with runtime and
>> leverage the Metrics API to collect the other metrics in later phases of
>> development.
>> The tests will be fully configurable through pipeline options, and it
>> will be possible to run any custom scenario manually. However, a
>> representative set of testing scenarios will be run periodically using
>> Jenkins.
>>
>> Regards,
>> Łukasz
>>
>> On Wed, Sep 5, 2018 at 8:31 PM Rafael Fernandez <[email protected]> wrote:
>>
>> neat!
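[The summary above describes a Synthetic Source used to create stressful conditions: large keys and values, delays, keeping the CPU busy or asleep. A stdlib-only Python sketch of the core idea (the parameter names are illustrative, not the actual Synthetic Source API):]

```python
import os
import time

def synthetic_source(num_records, key_size, value_size, delay_per_record=0.0):
    """Yield num_records synthetic KV pairs of the requested sizes,
    optionally sleeping per record to simulate a slow source."""
    for _ in range(num_records):
        if delay_per_record:
            time.sleep(delay_per_record)
        yield (os.urandom(key_size), os.urandom(value_size))

# 100 records with 10-byte keys and 90-byte values, no artificial delay.
records = list(synthetic_source(100, key_size=10, value_size=90))
print(len(records), len(records[0][0]), len(records[0][1]))  # 100 10 90
```

[The real API is richer (configurable key distributions, CPU-burning steps, splittable sources), but tuning knobs like these is what lets one test sweep from tiny hot keys to huge skewed values without writing a new pipeline.]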
left a comment or two
>>
>> On Mon, Sep 3, 2018 at 3:53 AM Łukasz Gajowy <[email protected]> wrote:
>>
>> Hi all!
>>
>> I'm bumping this (in case you missed it). Any feedback and questions are
>> welcome!
>>
>> Best regards,
>> Łukasz
>>
>> On Mon, Aug 13, 2018 at 1:51 PM Jean-Baptiste Onofré <[email protected]> wrote:
>>
>> Hi Lukasz,
>>
>> Thanks for the update; the abstract looks promising.
>>
>> Let me take a look at the doc.
>>
>> Regards
>> JB
>>
>> On 13/08/2018 13:24, Łukasz Gajowy wrote:
>> > Hi all,
>> >
>> > since the Synthetic Sources API has been introduced in the Java and
>> > Python SDKs, it can be used to test the performance of some basic
>> > Apache Beam operations (i.e. GroupByKey, CoGroupByKey, Combine, ParDo
>> > and ParDo with SideInput). This, in brief, is why we'd like to share
>> > the proposal below:
>> >
>> > https://docs.google.com/document/d/1PuIQv4v06eosKKwT76u7S6IP88AnXhTf870Rcj1AHt4/edit?usp=sharing
>> >
>> > Let us know what you think in the document's comments. Thank you in
>> > advance for all the feedback!
>> >
>> > Łukasz
>>
