On Fri, Nov 27, 2009 at 10:52 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> But then I got to thinking..... I admit I've only scratched the
> surface of the JUnit4 parallelization stuff. That said, it
> seems like the real benefit comes from making use of
> multiple cores, we don't get huge speedups just from
> running multiple threads at once on a single core. Which
> makes sense if you're not doing much in the way of I/O.

Right, it's the multi-core machines that gain the most from this.

> This notion was inspired by the "scary Python script"
> comment.....
>
> So what if we use Ant ForEach construct instead? Yet
> again this is a fuzzy idea I'm throwing out without much
> to back it up. Mostly I'm wondering if anyone's thought about
> it before or can shoot it down before it takes wing. Or if
> it is worth exploring.
>
> Assuming we structure our test directories so there are only
> directories at the root of the test area, could we persuade Ant
> to fire off the tests N directories at a time in parallel?
> N would default to 1 but could be passed in to the task, something
> like -DmaxThreads=4. ForEach actually has a maxThreads
> parameter..... In fact, we wouldn't even need to have only directories
> at the test root, but the individual test files at the root would probably
> be inefficiently run.
>
> I suspect that keeping the test directories in balance would be
> much less work than trying to parallelize using JUnit4, and be
> much less fraught with gremlins. This assumes we get
> sufficient isolation by Ant running separate threads, about
> which I have absolutely NO information. Like I said, mostly
> I'm wondering if anybody's gone down this path before and
> has wisdom to offer.

I think this rough idea is a good approach, though I don't know much
about ant's ForEach.

One thing the scary Python script does is divide up the index & search
packages into 2 parts ("a" and "b"), by breaking up the tests
according to 1st letter. We might be able to take a similar approach,
so that we're not forced to unnaturally separate tests into subdirs?
The entire index or search package was too slow to run otherwise (ie,
I needed to throw concurrency at it).

> Which *still* doesn't mean we shouldn't do whatever we can
> to speed up individual tests, but looking at the timings there's
> no obvious low-hanging fruit....

Yup. It's definitely an ongoing thing too...
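
For illustration only, the "a"/"b" split by 1st letter mentioned above could look roughly like the sketch below. This is not the actual script: the function name, the pivot letter "m", and the stripping of a leading "Test" prefix are all assumptions made for the example.

```python
def split_by_first_letter(test_names, pivot="m"):
    """Partition test class names into two groups ("a" and "b").

    Hypothetical helper: the pivot letter and the handling of the
    "Test" prefix are assumptions, not taken from the real script.
    """
    part_a, part_b = [], []
    for name in sorted(test_names):
        # Most test classes start with "Test", so compare the letter after it.
        stem = name[4:] if name.startswith("Test") else name
        key = stem[:1].lower()
        # Names whose key letter sorts before the pivot go into part "a".
        (part_a if key < pivot else part_b).append(name)
    return part_a, part_b
```

Each part can then be handed to the test runner as its own unit of work, so a slow package is spread across two threads without moving any files into new subdirectories.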

> I wonder if we could somehow run the various directories in
> time order, longest-to-shortest in the hope that all the threads
> would finish up "close enough" to the same time. I haven't
> thought about *how* to make this happen yet though....

This is very important -- I do the same thing in the Python script.

Also, will ant's ForEach take a set of, say, 30 things to work on, plus
the # of threads to use, and just pull from that queue of 30, in order?

> Anyway, I'll be happy to pursue this if y'all think it has merit,
> let me know and I'll open a JIRA and take it on. For the
> benefit of those aforementioned *real* people with *real*
> machines, who I'll rely upon to help test this notion....
>
> Is the poor-man's version of this on a dual-core machine
> just running "test-core" and "test-contrib" in two separate
> windows?

I think you could, except that I think they share sub-tasks (eg,
"compile-core"), so the two will sometimes stomp on each other.

The scary Python script first uses a single thread to compile
everything, then runs N threads pulling from the queue. BUT: I apply
a temporary patch to the ant build files, so that the N threads do not
each try to, eg, compile-core or jar-core separately.

Also, one thing I'd love to try is NOT forking the JVM for each test
(fork="no" in the junit task). I wonder how much time that'd buy...

Mike
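
As a rough sketch of the "longest-to-shortest, N threads pulling from one queue" idea discussed above (again, not the actual script): the function name, the `jobs` mapping of job name to estimated seconds, and the `run_job` callback are all hypothetical, and compilation is assumed to have already happened once, up front.

```python
import queue
import threading


def run_longest_first(jobs, num_threads, run_job):
    """Run jobs on num_threads workers, handing out the slowest jobs first.

    jobs: dict mapping a job name to its estimated duration in seconds
          (hypothetical input; real timings would come from earlier runs).
    run_job: callable invoked with each job name (e.g. it might launch ant).
    Returns the list of job names in the order workers picked them up.
    """
    work = queue.Queue()
    # Queue the longest jobs first so short ones fill the gaps at the end.
    for name in sorted(jobs, key=jobs.get, reverse=True):
        work.put(name)

    started = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            with lock:
                started.append(name)
            run_job(name)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return started
```

With a single shared queue, a thread that finishes early simply takes the next job, so the directories do not need to be balanced by hand.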