Hello folks,

I am trying to write a fault-tolerant system that meets the following criteria:
1) Recover from any software/hardware crash.
2) Dynamically shrink and grow.
3) Migrate processes among machines.

Does anyone have example code? Which MPI platform is recommended for meeting these requirements?

I am using three MPI platforms, and each has its own issues:
1) MPICH2 - good multi-threading support, but poor fault-tolerance mechanisms.
2) OpenMPI - does not support multi-threading properly, and I cannot get it to trap exceptions yet.
3) FT-MPI - old, and does not support multi-threading at all.

Any suggestions?

-- 
Regards,
Mohammad Huwaidi

We can't resolve problems by using the same kind of thinking we used when we created them.
--Albert Einstein