Hello. Some time ago I started studying MPI (Open MPI). I need to write a client/server application that runs on 50 servers in parallel. Each instance must be able to communicate with the others over TCP/IP (sending commands, doing some parallel computations).
The master controls all the clients (slaves): it sends control commands and restarts clients if needed. If the master machine running the server application dies, some other server needs to take over the master role and control the remaining slaves. Can I do this with Open MPI, or do I need to write a standard TCP/IP client/server application?

I have tried reading some Google search results, such as this one:
http://docs.sun.com/source/819-7480-11/ExecutingPrograms.htmlaopenmpi% 20orted%20persistent%20daemon
but orted returns an error:

orted --daemonize
[mobile:24107] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 125
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS

Thank you. Sorry for my poor English.

-- 
Vasiliy G Tolstov <v.tols...@selfip.ru>
Selfip.Ru