On Tue, Sep 09, 2014 at 10:57:09AM +0000, VandeVondele Joost wrote:
> > No.  As I wrote earlier, splitting on filenames and test counts only is only
> > a very rough split; all the splits really need to be backed up by real timing
> > data from popular targets.
>
> Furthermore, for parallel performance, it is not
> so important that times are distributed evenly (it is anyway unlikely the
> number of goals is exactly divided by N of -jN), but rather that the goals
> are ordered (executed) from slow to fast (similar to omp schedule guided).
> Most of the real bottlenecks are single letter patterns (e.g. p* since
> prxxxx is such a common filename), and this is ultimately limiting.

I disagree.  If e.g. in gcc.dg/ more than a third of the testcases are pr*.c,
then running dg.exp=p* in one job and dg.exp=a* in another one etc. is
simply a bad idea; the pr*.c tests should be split further and some other
letters should just be done together.  Even that can be done
semi-automatically.

If you get the whitespace right, you can provide multiple different wildcards
to a single *.exp file, e.g.
make check-gcc RUNTESTFLAGS="dg.exp='p[0-9A-Za-qs-z]* pr[9A-Za-z]*'"
should cover all tests starting with p other than pr[0-8]*.c (where you
could split say pr[0-2]* into one job, pr[3-5]* into another and
pr[6-8]* into another).

The fact that some check-gcc or check-gfortran test job is early in the list
doesn't mean it will be started early; you also need to consider all the
other potentially long jobs like check-g++, check-target-libgomp,
check-target-libstdc++-v3 etc.

	Jakub
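
To illustrate the wildcard split described above, here is a minimal sketch in Python that checks which job a given test file name would land in. It assumes that Python's fnmatch is a close enough stand-in for the glob matching done by the test harness, which may not hold in every corner case. The job names (p-rest, pr0-2, ...) and the helper functions are hypothetical and exist only for this illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical job names mapped to the wildcard list each job would
# pass to dg.exp.  The patterns are the ones from the message above.
JOBS = {
    "p-rest": ["p[0-9A-Za-qs-z]*", "pr[9A-Za-z]*"],
    "pr0-2": ["pr[0-2]*"],
    "pr3-5": ["pr[3-5]*"],
    "pr6-8": ["pr[6-8]*"],
}


def jobs_for(name):
    """Return the names of all jobs whose wildcards match the file name."""
    return [job for job, patterns in JOBS.items()
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


def runtestflags(exp, patterns):
    """Build a RUNTESTFLAGS argument with space-separated wildcards."""
    return "RUNTESTFLAGS=\"%s='%s'\"" % (exp, " ".join(patterns))


if __name__ == "__main__":
    for test in ["pr12345.c", "pr45678.c", "pr81234.c", "pr9876.c", "parm-1.c"]:
        print(test, jobs_for(test))
    print("make check-gcc", runtestflags("dg.exp", JOBS["p-rest"]))
```

A file name that comes back with an empty list, or with more than one job, points at a gap or an overlap in the split, so a check like this could be run over the directory listing before the patterns are committed to the Makefile.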