Thanks for your work on this, Jason! PerfKit looks like a great option. I'm excited to see performance testing for Beam coming together.

Since one of the benefits of using PerfKit is that it can spin up data stores for IO testing in various environments, we should look into whether we can reuse that functionality to help with running our IO integration tests. A good option seems to be running the IO integration tests inside PerfKit: PerfKit manages spin-up/spin-down of the data stores and just passes the info about the data stores on to the integration test.

I'm not entirely familiar with Jenkins/PerfKit, so I might be missing something, but I believe that'd generally look like:

1. Jenkins starts PerfKit for the test, passing in info about which test to run (need to look into how we would do that).
2. PerfKit spins up the data store necessary for the test (at this point, the PerfKit service for that data store would talk to the container orchestration software we're using to create the instances needed).
3. PerfKit runs the IO transform's integration test, passing the information about the data store to the test (possibly via pipeline parameters?), and the integration test runs.
4. The test finishes and control passes back to PerfKit, which cleans up the data stores.
5. PerfKit returns success/failure to Jenkins.

I'd want to investigate more and play around with this a bit before saying this is definitely the right way to go, but it does seem to let the ITs use the advantages of PerfKit without strongly coupling the integration tests to PerfKit.

S

On Sat, Dec 10, 2016 at 8:04 AM Jean-Baptiste Onofré <[email protected]> wrote:

Cool !

Please use the mailing list and Jira to sync effort.

Thanks,
Regards
JB

On 12/10/2016 05:00 PM, Otávio Carvalho wrote:
> Awesome, Jason!
>
> I am also interested in contribute to this effort by building/porting
> streaming microbenchmarks to Beam.
>
> I will make contact in the following weeks.
>
> Regards,
> Otavio.
>
> 2016-12-09 15:11 GMT-02:00 Jean-Baptiste Onofré <[email protected]>:
>
>> Happy to help too ;)
>>
>> @Jason, as discussed together, I will send my config (Marathon JSON,
>> Dockerfile, ...), I'm so sorry to be late on this.
>>
>> Regards
>> JB
>>
>>
>> On 12/09/2016 05:59 PM, Amit Sela wrote:
>>
>>> This is great Jason!
>>>
>>> Let me know if / how I can assist with Spark, or generally.
>>>
>>> Thanks,
>>> Amit
>>>
>>> On Thu, Dec 8, 2016 at 9:01 PM Jason Kuster <[email protected]>
>>> wrote:
>>>
>>> Hey all,
>>>>
>>>> So as I mentioned on Stephen's IO Testing thread a few days ago I've been
>>>> doing a bunch of investigating into performance testing frameworks. I've
>>>> put all my thoughts into a doc here and I'd love to hear thoughts about my
>>>> investigation and what I'm proposing going forward.
>>>>
>>>> https://docs.google.com/document/d/18ffP1vYurvNe92Efs_6hFFBDYC2dQEdWw135_GWZ2YU/view
>>>>
>>>> Copying from the earlier mail:
>>>> The tl;dr version is that there are a number of tools out there, but that
>>>> the best one I was able to find was a tool called PerfKit Benchmarker
>>>> (PKB)[1]. As it turns out, they already had the ability to benchmark Spark
>>>> (I have a PR out to extend the Spark functionality[2] and a couple more
>>>> improvements in the works), and I've put together some additional work in a
>>>> branch on my repository[3] to enable proof-of-concept Dataflow Java
>>>> benchmarks. I'm pretty excited about it overall.
>>>>
>>>> [1] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker
>>>> [2] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1214
>>>> [3] https://github.com/jasonkuster/PerfKitBenchmarker/tree/beam
>>>>
>>>> Looking forward to moving forward with this.
>>>>
>>>> Jason
>>>>
>>>> --
>>>> -------
>>>> Jason Kuster
>>>> Apache Beam (Incubating) / Google Cloud Dataflow
>>>>
>>>>
>>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
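
P.S. To make steps 1-5 at the top of this mail a bit more concrete, here's a rough Python sketch of the control flow I have in mind. It is not PerfKit's actual API: `DataStoreInfo`, `run_io_integration_test`, the three callables, and the `--dataStoreHost`/`--dataStorePort` option names are all made up for illustration. The only point is that the data store is always cleaned up, and that the test sees nothing but the connection info handed to it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DataStoreInfo:
    """Connection details handed to the integration test (hypothetical shape)."""
    host: str
    port: int


def run_io_integration_test(
    spin_up: Callable[[], DataStoreInfo],
    tear_down: Callable[[DataStoreInfo], None],
    run_test: Callable[[List[str]], bool],
) -> bool:
    """Steps 2-5: spin up the store, run the test, clean up, report the result.

    All three callables are placeholders for whatever the benchmarking tool
    and the test harness really provide.
    """
    info = spin_up()  # step 2: create the data store instances
    try:
        # Step 3: pass the data store info to the test as pipeline-style
        # options. These option names are assumptions, not real Beam options.
        options = [
            f"--dataStoreHost={info.host}",
            f"--dataStorePort={info.port}",
        ]
        return run_test(options)
    except Exception:
        # A crashing test counts as a failure rather than aborting cleanup.
        return False
    finally:
        tear_down(info)  # step 4: always clean up the data store
```

Jenkins (steps 1 and 5) would only need to invoke this and map the boolean to a build status, so the integration test itself never has to import anything from the benchmarking tool.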
