On Fri, Mar 5, 2010 at 12:03 PM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> This type of failure is usually due to prelink'ing being left enabled on one
> or more of the systems. This has come up multiple times on the Open MPI
> list, but is actually a problem between BLCR and the Linux kernel. BLCR has
> a FAQ entry on this that you will want to check out:
>  https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink
>
> If that does not work, then we can look into other causes.

I also suggest checkpointing and restarting an app with BLCR directly,
without Open MPI. That is, take any simple app, run it with cr_run,
checkpoint it with cr_checkpoint, then restart it with cr_restart. Make
sure the blcr kernel module is loaded first. That way you can tell
whether the problem is related to Open MPI or not.
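
A rough sketch of that BLCR-only test, in case it helps. It assumes
the BLCR utilities are on your PATH and uses "sleep" as a stand-in for
any simple app. The function name and the context file path are made
up for illustration, and the cr_checkpoint options (--term, -f) should
be checked against the man pages of your BLCR version.

```shell
#!/bin/sh
# Sketch of a BLCR-only checkpoint/restart check (no Open MPI involved).
# blcr_smoke_test and the default context path are hypothetical names.

blcr_smoke_test() {
    ctx="${1:-/tmp/blcr_smoke.ctx}"   # hypothetical context file path

    # Skip cleanly if the BLCR utilities are not installed.
    for tool in cr_run cr_checkpoint cr_restart; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "skipped: $tool not found"
            return 0
        fi
    done

    # The blcr kernel module has to be loaded before anything else works.
    if ! lsmod 2>/dev/null | grep -q '^blcr'; then
        echo "skipped: blcr kernel module not loaded (try: modprobe blcr)"
        return 0
    fi

    # 1. Start a simple app under cr_run so it can be checkpointed.
    cr_run sleep 20 &
    pid=$!
    sleep 1

    # 2. Checkpoint it to $ctx and terminate it, which frees its PID
    #    for the restart.
    if ! cr_checkpoint --term -f "$ctx" "$pid"; then
        echo "checkpoint failed"
        return 1
    fi
    wait "$pid" 2>/dev/null

    # 3. Restart from the context file and report the result.
    if cr_restart "$ctx"; then
        echo "restart ok"
    else
        echo "restart failed"
        return 1
    fi
}

blcr_smoke_test "$@"
```

If this fails with the same error as your MPI job, the problem is in
BLCR or the system setup (prelink, for example) and not in Open MPI.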

Regards,
