On Mon, Oct 01, 2018 at 11:58:25AM +0200, Daniel Gustafsson via curl-library wrote:
> I was poking a little at parallelizing the test suite in order to try and
> shave some time off the total runtime. But before sinking time into that I
> wanted to ask if there are/have been any other attempts at this? Has anyone
> hacked on this and if so, are there any learnings that can be shared?

I have a local branch I started after the discussions during curl://up Nürnberg on the topic. I got 80% of the way to running tests on different protocols in parallel (I could do it, but it required manually starting two test harnesses and selecting nonconflicting test ranges). Most of the changes involved making each test independent in its use of input and output files so there would be no conflicts at run-time. The test code has changed in the 1.5 years since, so it would be a bit of work to rebase it all onto the current code. In retrospect, I probably should have checked in each part as it was completed, but since few of the changes along the way really helped improve curl without the entire thing being in place, I didn't.

The other problem with the approach I was working on, namely, parallelizing by protocol but coordinating all the tests from a single test harness with a single set of test servers (as it is mostly done today), is that the speedup would be limited. While it would be relatively straightforward to implement, you wouldn't see more than a 2× speedup, since more than half the tests involve a single protocol, HTTP, and those would still run one after another.

A slightly different approach would involve starting N entire test harnesses in parallel, each responsible for its own suite of test servers running on its own range of test ports and running in its own tests/log/ directory. Since nothing would be shared between the test servers besides the input files, there would be no inherent limit to the number of test harnesses (and therefore parallel tests) that could be run at once. If I were looking at the problem again, that's the approach I would take. It might even end up requiring fewer changes than the approach I was working on.

To give a quick sketch, the test harness would open a pipe and fork() itself fairly soon after startup, putting the forked copies into slave mode, in which they listen on the pipe for instructions.
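
To make that model a bit more concrete, here is a minimal sketch of it. It is written in Python purely for illustration (the real harness is Perl) and is not taken from my branch. Every name in it (worker_layout, run_test, run_parallel), the port numbers and the per-worker log directory naming are assumptions, and run_test is only a stand-in for actually running a test. The forked copies are called workers here. A small dispatch loop is included so that the example runs on its own; it needs a POSIX system for fork().

```python
import json
import os

BASE_PORT = 8990        # assumed first port of worker 0's private range
PORTS_PER_WORKER = 100  # assumed width of each worker's port range


def worker_layout(index):
    """Private resources for one worker; nothing here is shared."""
    return {
        "port_base": BASE_PORT + index * PORTS_PER_WORKER,
        "log_dir": os.path.join("tests", "log", str(index)),  # hypothetical
    }


def run_test(test_id, layout):
    """Hypothetical stand-in for running one test on this worker's servers."""
    return {"test": test_id, "ok": True, "port_base": layout["port_base"]}


def worker_loop(index, cmd_r, res_w):
    """Worker mode: read test numbers from the pipe until EOF, report each."""
    layout = worker_layout(index)
    with os.fdopen(cmd_r) as commands:
        for line in commands:
            result = run_test(int(line), layout)
            result["worker"] = index
            # One short write() per result: POSIX keeps pipe writes of up to
            # PIPE_BUF bytes atomic, so lines from workers do not interleave.
            os.write(res_w, (json.dumps(result) + "\n").encode())


def run_parallel(test_ids, num_workers):
    """Fork the workers, hand out tests one at a time, collect the results."""
    res_r, res_w = os.pipe()   # single result pipe shared by all workers
    pids, cmd_fds = [], []
    for index in range(num_workers):
        cmd_r, cmd_w = os.pipe()
        pid = os.fork()
        if pid == 0:                      # child: become a worker
            try:
                os.close(cmd_w)
                os.close(res_r)
                for fd in cmd_fds:        # inherited ends of earlier pipes
                    os.close(fd)
                worker_loop(index, cmd_r, res_w)
            finally:
                os._exit(0)               # never fall back into master code
        os.close(cmd_r)
        pids.append(pid)
        cmd_fds.append(cmd_w)
    os.close(res_w)

    pending = list(test_ids)

    def send_next(index):
        """Give the worker another test, or close its pipe so it exits."""
        if pending:
            os.write(cmd_fds[index], f"{pending.pop(0)}\n".encode())
        else:
            os.close(cmd_fds[index])

    for index in range(num_workers):
        send_next(index)

    results = []
    with os.fdopen(res_r) as stream:
        for line in stream:               # EOF once every worker has exited
            result = json.loads(line)
            results.append(result)        # arrival order, not test order
            send_next(result["worker"])
    for pid in pids:
        os.waitpid(pid, 0)
    return results


if __name__ == "__main__":
    for r in run_parallel([1, 2, 3, 4, 5], num_workers=2):
        status = "OK" if r["ok"] else "FAILED"
        print(f"test {r['test']:4d} worker {r['worker']} "
              f"(ports from {r['port_base']}): {status}")
```

Each worker has at most one test outstanding and gets the next one only after reporting a result, so a slow test does not hold up the others. The single shared result pipe is just one way to do it; a pipe per worker plus select() would work as well.
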
The master test harness would determine the tests to run as it does today, but rather than running them itself, it would send a message to one of the slaves, which would actually run the test. The master would gather the result of each test run and display it to the user as it completes. The user wouldn't notice any difference except that the test results would no longer necessarily appear in sequential order (the result of a faster test would be shown before that of a slower one that started at the same time).

Some of the infrastructure changes I made would be really useful in this approach as well, as I refactored the main test loop to make such a model easy to fit in. I can publish my branch somewhere if someone would find it useful, or even rebase the more generally useful cleanups if you're not in a hurry.

>>> Dan
