<<<Also, will ant's ForEach take a set of say 30 things to work on, and take the # threads to use, and just pull from that queue of 30, in order?>>>
That's the implication I took from here:
http://ant-contrib.sourceforge.net/tasks/tasks/index.html

Ignorance is bliss, I didn't find the ForEach by looking at Ant
documentation, but by googling "ant parallel". Turns out this is in
Contrib. I don't even know if it's current.

Tell ya' what. I'll take a quick whack at it. I'm a believer in
prototyping if at all possible. So I'll create a really stupid
implementation of this with a hard-coded list of tests to run and see
what happens. If it works for me, I'll pass it along to whoever wants
to give it a spin and we'll get a clue whether it provides enough of
an improvement to pursue seriously.

I'll open a JIRA since at least Mike and I seem to be interested....

Erick

On Fri, Nov 27, 2009 at 1:27 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Fri, Nov 27, 2009 at 10:52 AM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
> > But then I got to thinking..... I admit I've only scratched the
> > surface of the JUnit4 parallelization stuff. That said, it
> > seems like the real benefit comes from making use of
> > multiple cores, we don't get huge speedups just from
> > running multiple threads at once on a single core. Which
> > makes sense if you're not doing much in the way of I/O.
>
> Right, it's the multi-core machines that gain the most from this.
>
> > This notion was inspired by the "scary Python script"
> > comment.....
> >
> > So what if we use Ant ForEach construct instead? Yet
> > again this is a fuzzy idea I'm throwing out without much
> > to back it up. Mostly I'm wondering if anyone's thought about
> > it before or can shoot it down before it takes wing. Or if
> > it is worth exploring.
> >
> > Assuming we structure our test directories so there are only
> > directories at the root of the test area, could we persuade Ant
> > to fire off the tests N directories at a time in parallel?
> > N would default to 1 but could be passed in to the task, something
> > like -DmaxThreads=4.
> > ForEach actually has a maxThreads
> > parameter..... In fact, we wouldn't even need to have only directories
> > at the test root, but the individual test files at the root would
> > probably be inefficiently run.
> >
> > I suspect that keeping the test directories in balance would be
> > much less work than trying to parallelize using JUnit4, and be
> > much less fraught with gremlins. This assumes we get
> > sufficient isolation by Ant running separate threads, about
> > which I have absolutely NO information. Like I said, mostly
> > I'm wondering if anybody's gone down this path before and
> > has wisdom to offer.
>
> I think this rough idea is a good approach, though I don't know much
> about ant's ForEach.
>
> One thing the scary Python script does is divide up index & search
> packages into 2 parts ("a" and "b"), by breaking up the tests
> according to 1st letter. We might be able to take a similar approach,
> so that we're not forced to unnaturally separate tests into subdirs?
>
> The entire index or search package was too slow to run otherwise (ie,
> I needed to throw concurrency at it).
>
> > Which *still* doesn't mean we shouldn't do whatever we can
> > to speed up individual tests, but looking at the timings there's
> > no obvious low-hanging fruit....
>
> Yup. It's definitely an ongoing thing too...
>
> > I wonder if we could somehow run the various directories in
> > time order, longest-to-shortest in the hope that all the threads
> > would finish up "close enough" to the same time. I haven't
> > thought about *how* to make this happen yet though....
>
> This is very important -- I do the same thing in the python script.
>
> Also, will ant's ForEach take a set of say 30 things to work on, and
> take the # threads to use, and just pull from that queue of 30, in
> order?
>
> > Anyway, I'll be happy to pursue this if y'all think it has merit,
> > let me know and I'll open a JIRA and take it on.
> > For the
> > benefit of those aforementioned *real* people with *real*
> > machines, who I'll rely upon to help test this notion....
> >
> > Is the poor-mans version of this on a dual-core machine
> > just running "test-core" and "test-contrib" in two separate
> > windows?
>
> I think you could, except, I think they share sub-tasks (eg,
> "compile-core") so the two will sometimes stomp on each other.
>
> The scary python script first uses a single thread to compile
> everything, then runs N threads pulling from the queue. BUT: I apply
> a temporary patch to the ant build files, so that the N threads do not
> try to, eg, compile-core or jar-core, separately.
>
> Also one thing I'd love to try is NOT forking the JVM for each test
> (fork="no" in the junit task). I wonder how much time that'd buy...
>
> Mike
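
For anyone following along, here is a rough sketch of the scheduling idea discussed above: sort the work longest-to-shortest, then let N threads pull from that queue in order. It is plain Java and only illustrates the queue behaviour we'd want. It says nothing about how ant-contrib's ForEach actually schedules its iterations, which is still the open question. The names LongestFirstRunner, TestDir, expectedMillis and runAll are all made up for illustration, and the Thread.sleep call is a stand-in for really running a directory's tests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: N worker threads drain a queue ordered longest-first.
final class LongestFirstRunner {

    /** Hypothetical unit of work: a test directory plus its last known run time. */
    static final class TestDir {
        public final String name;
        public final long expectedMillis;

        public TestDir(String name, long expectedMillis) {
            this.name = name;
            this.expectedMillis = expectedMillis;
        }
    }

    /** Returns a copy of the input sorted longest-to-shortest by expected run time. */
    static List<TestDir> longestFirst(List<TestDir> dirs) {
        List<TestDir> sorted = new ArrayList<>(dirs);
        sorted.sort(Comparator.comparingLong((TestDir d) -> d.expectedMillis).reversed());
        return sorted;
    }

    /**
     * Runs every directory on a pool of maxThreads workers and returns the
     * names in the order the workers started them.
     */
    static List<String> runAll(List<TestDir> dirs, int maxThreads) throws InterruptedException {
        List<String> started = Collections.synchronizedList(new ArrayList<>());
        // A fixed pool queues submitted tasks FIFO, so idle workers take the
        // next-longest remaining directory.
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        for (TestDir d : longestFirst(dirs)) {
            pool.execute(() -> {
                started.add(d.name);
                try {
                    // Assumption: stand-in for actually running the tests in d.
                    Thread.sleep(d.expectedMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return started;
    }
}
```

With maxThreads set to 1 this degenerates to running everything serially in longest-first order, which matches the "N would default to 1" idea. With more threads, the long-running directories start first and the short ones fill in the gaps at the end.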