Re: [OMPI users] Problem with Filem

2009-05-07 Thread Josh Hursey
I'm glad that the recent commits fixed your problem. At the moment, we do not implement a mirroring file storage mechanism (where peers save checkpoints to each other's local disks). We have been working towards supporting this and other techniques in some off-trunk development, but nothing r

Re: [OMPI users] Problem with Filem

2009-05-07 Thread Bouguerra mohamed slim
Hello, Thank you; with release r21172 it works. But how can I distribute the checkpoints across different storage nodes? It is too costly for all the compute nodes to write to a single storage node. Josh Hursey wrote: I just realized that not all of the FileM fixes made it to the trunk i

Re: [OMPI users] Problem with Filem

2009-05-01 Thread Josh Hursey
This typically means that one or more of the rcp/scp or rsh/ssh commands failed. FileM should print an error message when one of the copy commands fails. Try turning up the verbose level to 10 to see if it indicates any problems: -mca filem_rsh_verbose 10 Can you send me the MCA

[OMPI users] Problem with Filem

2009-04-30 Thread Bouguerra mohamed slim
Hello, I have a problem with the FileM module when I try to checkpoint to a remote host without a shared file system. I am using the new Open MPI 1.3.2, and the problem is the same as in version 1.3.1. When I use an NFS file system, it works. Thus I guess that it is a problem with the File