Faster tests would be great. I recall that the straightforward ways to parallelize via Maven haven't worked because many tests collide with one another. Is this about running each module's tests in a container? That should work.
I can see how this is becoming essential for repeatable and reliable Python/R builds, which depend on the environment to a much greater extent than the JVM does. I don't have a strong preference for AMPLab vs ASF builds. I suppose using the ASF machinery is a little tidier. If it's got a later Jenkins that's required, also a plus, but I assume updating AMPLab isn't so hard here either. I think the key issue is which environment is easier to control and customize over time.

On Wed, Nov 1, 2017 at 6:05 AM Xin Lu <x...@salesforce.com> wrote:

> Hi everyone,
>
> I tried sending emails to this list and I'm not sure if it went through so
> I'm trying again. Anyway, a couple months ago before I left Databricks I
> was working on a proof of concept that parallelized Spark tests on
> jenkins. The way it worked was basically it build the spark jars and then
> ran all the tests in a docker container on a bunch of slaves in parallel.
> This cut the testing time down from 4 hours to approximately 1.5 hours.
> This required a newer version of jenkins and the Jenkins Pipeline plugin.
> I am wondering if it is possible to do this on amplab jenkins. It looks
> like https://builds.apache.org/ has upgraded so Amplabs jenkins is a year
> or so behind. I am happy to help with this project if it is something that
> people think is worthwhile.
>
> Thanks
>
> Xin