Thanks for reviewing it. When operating on averages, we should also present the standard deviation. A higher deviation means the results are more spread out and extreme, while a lower deviation means the results are closer to the average. A low deviation would indicate that the tests are becoming more stable, with values staying close to the trend, and that any detected anomaly is less likely to be a false positive. However, it would be really nice to run some queries over real data exported from BigQuery and see which other statistical parameters could also be useful.
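To make the standard-deviation idea concrete, here is a minimal Python sketch of the kind of check such a tool could run over historical runtimes. The function name and the 3-sigma threshold are illustrative assumptions, not part of the proposal:

```python
import statistics

def is_anomaly(runtimes, latest, threshold=3.0):
    """Flag `latest` as anomalous if it deviates from the mean of
    `runtimes` by more than `threshold` standard deviations.
    `threshold=3.0` is an illustrative choice, not a tuned value."""
    mean = statistics.mean(runtimes)
    stdev = statistics.pstdev(runtimes)
    if stdev == 0:
        # All historical runs were identical; any difference stands out.
        return latest != mean
    z_score = abs(latest - mean) / stdev
    return z_score > threshold

# Example: a stable 6-day history, then one slow run.
history = [100, 102, 98, 101, 99, 100]  # runtimes in seconds
print(is_anomaly(history, 130))  # far outside the normal spread -> True
print(is_anomaly(history, 103))  # within the normal spread -> False
```

With a low deviation (stable history), even a moderate slowdown produces a large z-score, which matches the intuition above that anomalies flagged against stable data are unlikely to be false positives.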
2018-04-16 19:38 GMT+02:00 Jason Kuster <jasonkus...@google.com>:

> Great suggestions -- added some comments. Do you have plans to add more
> sophisticated analysis past just analyzing runtime relative to the 6d
> average?
>
> On Mon, Apr 16, 2018 at 10:21 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
> wrote:
>
>> That is correct - for now, we're running tests on Dataflow only. There
>> were plans to run them on Spark and Flink (and possibly more runners) but
>> we had some difficulties along the way. We decided to focus on Dataflow,
>> at least for now. Currently the tests are quite flaky, so the priority is
>> to make them more stable. Meanwhile, we're providing all the necessary
>> "infrastructure" (hence the anomaly detection proposal).
>>
>> If anyone is willing to contribute in this area, these seem to be the
>> biggest blockers for Spark and Flink:
>> https://issues.apache.org/jira/browse/BEAM-3370
>> https://issues.apache.org/jira/browse/BEAM-3371
>>
>> Best regards,
>> Łukasz
>>
>> 2018-04-16 18:28 GMT+02:00 Pablo Estrada <pabl...@google.com>:
>>
>>> This is very cool!
>>> Are these dashboards for tests running on Dataflow only? Are there plans
>>> for other runners? : )
>>> -P.
>>>
>>> On Mon, Apr 16, 2018 at 9:23 AM Chamikara Jayalath <chamik...@google.com>
>>> wrote:
>>>
>>>> Thanks Dariusz. This sounds great. Added some comments.
>>>> Also, +Jeff Gardner <gardn...@google.com> who has experience on
>>>> performance regression analysis of integration tests.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> On Mon, Apr 16, 2018 at 2:58 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
>>>> wrote:
>>>>
>>>>> @Etienne +1 to doing that! :) If we have both results (Nexmark and
>>>>> IOITs) in BQ we could use the same (similar?) tools to detect anomalies
>>>>> captured by Nexmark (if there's a need for doing that).
>>>>>
>>>>> 2018-04-16 11:17 GMT+02:00 Etienne Chauchot <echauc...@apache.org>:
>>>>>
>>>>>> Very nice to see the dashboards!
>>>>>>
>>>>>> Regarding Kenn's comment: Nexmark supports outputting the results to
>>>>>> BigQuery, so it could be easily integrated into the dashboards. We're,
>>>>>> with Kenn, scheduling Nexmark runs. We could configure the output to
>>>>>> BigQuery dashboard tables?
>>>>>> WDYT?
>>>>>>
>>>>>> Etienne
>>>>>> Le samedi 14 avril 2018 à 23:20 +0000, Kenneth Knowles a écrit :
>>>>>>
>>>>>> This is very cool. So is it easy for someone to integrate the
>>>>>> proposal to regularly run Nexmark benchmarks and get those on the
>>>>>> dashboard? (or a separate one to keep IOs in their own page)
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 9:02 AM Dariusz Aniszewski <
>>>>>> dariusz.aniszew...@polidea.com> wrote:
>>>>>>
>>>>>> Hello Beam devs!
>>>>>>
>>>>>> As you might have already noticed, together with Łukasz Gajowy, Kamil
>>>>>> Szewczyk and Katarzyna Kucharczyk (all directly cc'd here) we're
>>>>>> working on adding some performance tests to the project. We were
>>>>>> following directions from the Testing I/O Transforms in Apache Beam
>>>>>> <https://beam.apache.org/documentation/io/testing/> site (which we
>>>>>> plan to update in the near future).
>>>>>>
>>>>>> We started from testing various FileBasedIOs as part of BEAM-3060
>>>>>> <https://issues.apache.org/jira/browse/BEAM-3060>. So far we have
>>>>>> tests for:
>>>>>> - TextIO (with and without compression)
>>>>>> - AvroIO
>>>>>> - XmlIO
>>>>>> - TFRecordIO
>>>>>> that may run on the following filesystems:
>>>>>> - local
>>>>>> - GCS
>>>>>> - HDFS (except for TFRecordIO, see BEAM-3945
>>>>>> <https://issues.apache.org/jira/browse/BEAM-3945>)
>>>>>>
>>>>>> Besides FileBasedIOs we also covered:
>>>>>> - HadoopInputFormatIO
>>>>>> - MongoDBIO
>>>>>> - JdbcIO (in this case the test was there but was disabled; we fixed
>>>>>> it and enabled it)
>>>>>> - HCatalogIO (currently in PR
>>>>>> <https://github.com/apache/beam/pull/5097>)
>>>>>>
>>>>>> While currently all the tests are Maven-based, we responded to the
>>>>>> ongoing Gradle migration and created a PR
>>>>>> <https://github.com/apache/beam/pull/5003> that allows running them
>>>>>> via Gradle.
>>>>>>
>>>>>> All of those tests are executed on a daily basis on Apache Jenkins
>>>>>> <https://builds.apache.org/> and their results are published to
>>>>>> individual BigQuery tables. There is also a dashboard on which test
>>>>>> results may be viewed and compared:
>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688
>>>>>>
>>>>>> As we have some amount of tests already, we're currently working on a
>>>>>> tool that will analyze the results and search for anomalies, so devs
>>>>>> are notified if degraded performance is observed. You can find the
>>>>>> proposal document here:
>>>>>> https://docs.google.com/document/d/1Cb7XVmqe__nA_WCrriAifL-3WCzbZzV4Am5W_SkQLeA
>>>>>>
>>>>>> We welcome you to share your thoughts on performance tests in general
>>>>>> as well as the proposed solution for anomaly detection.
>>>>>>
>>>>>> Best,
>>>>>> Dariusz Aniszewski
>>>
>>> --
>>> Got feedback? go/pabloem-feedback
>>> <https://goto.google.com/pabloem-feedback>
>>
>
> --
> -------
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
> See something? Say something. go/jasonkuster-feedback