On Thu, Feb/18/2010 04:13:15PM, Jeff Squyres wrote:
> On Feb 18, 2010, at 10:48 AM, Ethan Mallove wrote:
>
> > To ensure there is never a collision between $a->{k} and $b->{k}, the
> > user can have two MTT clients share a $scratch, but they cannot both
> > run the same INI section simultaneously. I set up my scheduler to run
> > batches of MPI get, MPI install, Test get, Test build, and Test run
> > sections in parallel with successor INI sections dependent on their
> > predecessor INI sections (e.g., [Test run: foo] only runs after [Test
> > build: foo] completes). The limitation stinks, but the current
> > limitation is much worse: two MTT clients can't even run the same
> > *phase* out of one $scratch.
>
> Maybe it might be a little nicer just to protect the user from
> themselves -- if we ever detect a case where $a->{k} and $b->{k}
> both exist and are not the same value, dump out everything to a file
> and abort with an error message. This is clearly an erroneous
> situation, but running MTT in big parallel batches like this is a
> worthwhile-but-complicated endeavor, and some people are likely to
> get it wrong. So we should at least detect the situation and fail
> gracefully, rather than losing or corrupting results.
>
> Make sense?
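
A rough sketch of the check proposed above, for illustration only. It is written in Python rather than MTT's Perl, and every name in it (`merge_checked`, `merge_or_dump`, `MergeCollision`, the JSON dump format) is an assumption made for the example, not part of MTT. The idea: merge two nested result hashes, and if the same key holds two different values, write both inputs to a file and abort instead of silently overwriting one of them.

```python
import json
import tempfile


class MergeCollision(Exception):
    """Raised when both inputs hold different values for the same key."""


def merge_checked(a, b, path="results"):
    """Recursively merge b into a copy of a; raise on conflicting values."""
    merged = dict(a)
    for key, b_val in b.items():
        here = f"{path}/{key}"
        if key not in merged:
            # Only one side has this key: no conflict possible.
            merged[key] = b_val
        elif isinstance(merged[key], dict) and isinstance(b_val, dict):
            # Both sides have a sub-hash: descend and merge it.
            merged[key] = merge_checked(merged[key], b_val, here)
        elif merged[key] != b_val:
            # Both sides exist and disagree: the erroneous situation.
            raise MergeCollision(here)
    return merged


def merge_or_dump(a, b):
    """Merge a and b, or dump both to a file and abort on a collision."""
    try:
        return merge_checked(a, b)
    except MergeCollision as exc:
        # Keep everything for post-mortem before giving up.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".json", delete=False
        ) as fh:
            json.dump({"collision_at": str(exc), "a": a, "b": b}, fh, indent=2)
        raise SystemExit(
            f"conflicting values at {exc}; both inputs dumped to {fh.name}"
        )
```

Identical values on both sides are accepted, so two clients that recorded the same result for a key do not trigger the abort; only a genuine disagreement does.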
Yes. I'll add this.

-Ethan

> > I originally wanted the .dump files to be completely safe, but MTT
> > clients were getting locked out of the .dump files for way too long.
> > E.g., MTT::MPI::LoadInstalls happens very early in client/mtt, and an
> > hour could elapse before MTT::MPI::SaveInstalls is called in
> > Install.pm.
>
> Yep, if you lock from load->save, then that can definitely happen...
>
> --
> Jeff Squyres
> jsquy...@cisco.com
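
For contrast with locking from load to save, here is a hedged sketch of holding the lock only around a short read-merge-write step. It is Python rather than MTT's Perl, assumes a Unix system (`fcntl.flock`), and assumes a JSON dump file and a sidecar `.lock` file; `update_dump` and both file conventions are hypothetical and do not describe how MTT stores its .dump files.

```python
import fcntl
import json
import os


def update_dump(dump_path, new_entries):
    """Fold new_entries into the dump file under a short exclusive lock."""
    lock_path = dump_path + ".lock"
    with open(lock_path, "w") as lock_fh:
        # Blocks until no other client holds the lock.
        fcntl.flock(lock_fh, fcntl.LOCK_EX)
        try:
            # Re-read the file now, so updates written by other clients
            # since this client started are not lost.
            current = {}
            if os.path.exists(dump_path):
                with open(dump_path) as fh:
                    current = json.load(fh)

            # Shallow merge for brevity; a collision check would go here.
            current.update(new_entries)

            # Write to a temporary file and rename it into place, so a
            # crash mid-write does not leave a truncated dump behind.
            tmp_path = dump_path + ".tmp"
            with open(tmp_path, "w") as fh:
                json.dump(current, fh)
            os.replace(tmp_path, dump_path)
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)
    return current
```

The lock is held only while the file is read, merged, and rewritten, so a long-running install or test phase between load and save never keeps other clients waiting. The trade-off is that each client must re-read the file at save time rather than trusting what it loaded at startup.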