Hello again. My actual problem is that I don't know where to find the struct that holds the information used to send messages to the procs.
Something like:

Rank  URI
0     21222:tcp:192.168.1.1:1250
1     21223:tcp:192.168.1.2:1250
..... .....

What I need is to update it when I move a process away from its original node. Is there something like this?

Thanks a lot.

Hugo

2011/5/31 Hugo Meyer <meyer.h...@gmail.com>

> Hello all.
>
> I need some help restarting communication with a process that I
> restore on a different node. My situation is as follows:
>
> The process fails and is successfully restored on another node from a
> previous checkpoint that I sent there. Now, when a process tries to send a
> message to this restored process, it will fail, or at least it will be
> blocked in ompi_request_wait_completion.
>
> So, when this happens, I have to send a message to the sender's daemon,
> which will have the URI of where the process has been restored and will
> answer the proc with it, so that the proc can update this info.
>
> So, I need to know where in the code I can capture this attempt to send and
> then send the message to its daemon, and where and how I can update this info
> so that the message goes to the right place (same rank but new URI).
>
> I have to do it this way to avoid a collective communication.
>
> If you can give me a hand with this, it would be great.
>
> Best regards.
>
> Hugo
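
To make the question concrete, the rank-to-URI table and the "same rank, new URI" update could be sketched in C as below. This is only an illustration of the behaviour I am asking about: `contact_table_t`, `contact_table_init`, `contact_table_update`, and `contact_table_lookup` are hypothetical names, not Open MPI structures or functions, and the fixed size limits are assumptions.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS   64   /* hypothetical upper bound on the number of ranks */
#define MAX_URI_LEN 128  /* hypothetical maximum URI length, including NUL */

/* Hypothetical rank -> contact URI table; NOT an actual Open MPI structure. */
typedef struct {
    char uri[MAX_PROCS][MAX_URI_LEN];
    int  nprocs;
} contact_table_t;

/* Start with an empty table for nprocs ranks. */
void contact_table_init(contact_table_t *t, int nprocs)
{
    assert(t != NULL && nprocs >= 0 && nprocs <= MAX_PROCS);
    memset(t, 0, sizeof(*t));
    t->nprocs = nprocs;
}

/* Store (or overwrite) the URI for a rank, e.g. after the process has been
 * restored on another node. Returns 0 on success, -1 on invalid input. */
int contact_table_update(contact_table_t *t, int rank, const char *new_uri)
{
    if (t == NULL || new_uri == NULL || rank < 0 || rank >= t->nprocs)
        return -1;
    if (strlen(new_uri) >= MAX_URI_LEN)
        return -1;
    strcpy(t->uri[rank], new_uri);
    return 0;
}

/* Return the URI currently recorded for a rank, or NULL if none is known. */
const char *contact_table_lookup(const contact_table_t *t, int rank)
{
    if (t == NULL || rank < 0 || rank >= t->nprocs || t->uri[rank][0] == '\0')
        return NULL;
    return t->uri[rank];
}
```

In this sketch, the sender would call `contact_table_update` with the URI it receives from its daemon and then retry the send using the result of `contact_table_lookup`. Only the entry of the moved rank changes, so no collective communication is involved.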