On Feb 18, 2010, at 10:48 AM, Ethan Mallove wrote:
> To ensure there is never a collision between $a->{k} and $b->{k}, the
> user can have two MTT clients share a $scratch, but they cannot both
> run the same INI section simultaneously. I set up my scheduler to run
> batches of MPI get, MPI install, Test get, Test build, and Test run
> sections in parallel with successor INI sections dependent on their
> predecessor INI sections (e.g., [Test run: foo] only runs after [Test
> build: foo] completes). The limitation stinks, but the current
> limitation is much worse: two MTT clients can't even run the same
> *phase* out of one $scratch.
It might be a little nicer to protect users from themselves -- if we ever
detect a case where $a->{k} and $b->{k} both exist and are not the same
value, dump out everything to a file and abort with an error message. This is
clearly an erroneous situation, but running MTT in big parallel batches like
this is a worthwhile-but-complicated endeavor, and some people are likely to
get it wrong. So we should at least detect the situation and fail gracefully,
rather than losing or corrupting results.
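
Something like the following is what I have in mind. It is only a rough
sketch, not MTT code: find_conflicts, merge_or_abort, and the $dump_file
argument are all made-up names, and $left / $right stand in for $a / $b
above. It assumes values can be compared by their Data::Dumper text, which
treats the number 1 and the string '1' as different.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper (not part of MTT): return the keys that exist in
# both hashes but hold different values.
sub find_conflicts {
    my ($left, $right) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable text for nested hashes
    my @conflicts;
    foreach my $k (sort keys %$right) {
        next unless exists $left->{$k};
        push @conflicts, $k
            if Dumper($left->{$k}) ne Dumper($right->{$k});
    }
    return @conflicts;
}

# Hypothetical helper: merge $right into $left, or save both sides to
# $dump_file and die if any shared key disagrees.
sub merge_or_abort {
    my ($left, $right, $dump_file) = @_;
    my @conflicts = find_conflicts($left, $right);
    if (@conflicts) {
        local $Data::Dumper::Sortkeys = 1;
        open(my $fh, '>', $dump_file)
            or die "Cannot write $dump_file: $!";
        print $fh Data::Dumper->Dump([$left, $right, \@conflicts],
                                     [qw(left right conflicts)]);
        close($fh) or die "Cannot close $dump_file: $!";
        die "Conflicting values for key(s): @conflicts; "
          . "both sides saved to $dump_file\n";
    }
    # No conflicts: copying is safe because shared keys are identical.
    $left->{$_} = $right->{$_} foreach keys %$right;
    return $left;
}
```

The point of writing the file before dying is that nothing is lost: the
user can inspect both sides and work out which client stepped on the other.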
Make sense?
> I originally wanted the .dump files to be completely safe, but MTT
> clients were getting locked out of the .dump files for way too long.
> E.g., MTT::MPI::LoadInstalls happens very early in client/mtt, and an
> hour could elapse before MTT::MPI::SaveInstalls is called in
> Install.pm.
Yep, if you lock from load->save, then that can definitely happen...
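
One way to keep the lock short is to hold it only around the final
re-read/merge/write step rather than from load to save. A minimal sketch,
assuming a sidecar "$dump_file.lock" file and a helper named with_dump_lock
(both invented here, not existing MTT code), using Perl's standard flock:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper (not part of MTT): run $code while holding an
# exclusive lock on "$dump_file.lock", then release the lock even if
# $code dies.
sub with_dump_lock {
    my ($dump_file, $code) = @_;
    my $lock_file = "$dump_file.lock";
    open(my $lock, '>>', $lock_file)
        or die "Cannot open $lock_file: $!";
    flock($lock, LOCK_EX)
        or die "Cannot lock $lock_file: $!";
    my @result = eval { $code->() };
    my $err = $@;
    close($lock);    # closing the handle releases the lock
    die $err if $err;
    return @result;
}

# Intended use (the callee names are placeholders):
#   my $mine = load_without_lock($dump_file);   # early, no lock held
#   ... an hour of building ...
#   with_dump_lock($dump_file, sub {
#       my $current = load_without_lock($dump_file);  # re-read latest
#       write_dump($dump_file, merge($current, $mine));
#   });
```

The lock is then held for as long as the merge and write take, not for the
whole build. It is advisory only, so it helps only if every client goes
through the same helper.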
--
Jeff Squyres