Thanks for the reply, Ralph. I will look for a way to work around this situation for the moment.
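
For now I will probably just serialize the accepts on the logger side with a mutex, roughly along these lines (a simplified sketch, not the attached code; thread creation and error handling are omitted):

    #include <mpi.h>
    #include <pthread.h>

    /* one lock per logger process, so only one thread is inside
       MPI_Comm_accept at any given time */
    static pthread_mutex_t accept_lock = PTHREAD_MUTEX_INITIALIZER;

    void *accept_loop(void *arg)
    {
        char *port = (char *)arg;   /* port name returned by MPI_Open_port */
        MPI_Comm client;

        pthread_mutex_lock(&accept_lock);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
        pthread_mutex_unlock(&accept_lock);

        /* ... receive the logged events from the client ... */

        MPI_Comm_disconnect(&client);
        return NULL;
    }

This avoids having two threads of the same process inside MPI_Comm_accept at the same time, at the cost of accepting the connections one by one.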
Regards,
Hugo

2013/5/6 Ralph Castain <r...@open-mpi.org>

> We are working towards thread safety, but we are nowhere near ready yet.
>
> On May 6, 2013, at 3:39 AM, Hugo Daniel Meyer <meyer.h...@gmail.com> wrote:
>
> Sorry, I sent the previous message without finishing it.
>
> Hello to @ll.
>
> I'm not sure if this is the correct list to post this question, but maybe
> I'm dealing with a bug.
>
> I have developed an event logging mechanism in which application processes
> connect to event loggers (using MPI_Lookup_name, MPI_Open_port,
> MPI_Comm_connect, MPI_Comm_accept, etc.) that are part of another MPI
> application.
>
> I have written my own vprotocol component in which, once a process
> receives a message, it tries to establish a connection with an event
> logger, which is a thread that belongs to another MPI application.
>
> The event logger application consists of one MPI process per node, each
> with multiple threads waiting for connections from the MPI processes of
> the main application.
>
> I suspect there is a problem with the critical regions when several
> processes try to connect to the threads of the event logger concurrently.
>
> I'm attaching two short examples that I have written to reproduce the
> problem. First, I launch the event-logger application:
>
> mpirun -n 2 --machinefile machinefile2-th --report-uri URIFILE ./test-thread
>
> Then I launch the example like this:
>
> mpirun -n 16 --machinefile machine16 --ompi-server file:URIFILE ./thread_logger_connect
>
> I obtained this output:
>
> Published: radic_eventlog[1,6], ret=0
> [clus2:16104] [[39125,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file dpm_orte.c at line 315
> [clus2:16104] [[39125,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file dpm_orte.c at line 315
> [clus2:16104] *** An error occurred in MPI_Comm_accept
> [clus2:16104] *** on communicator MPI_COMM_SELF
> [clus2:16104] *** MPI_ERR_UNKNOWN: unknown error
> [clus2:16104] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 1 with PID 16104 on
> node clus2 exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> If I use a mutex to serialize the access to MPI_Comm_accept, the behavior
> is OK, but shouldn't MPI_Comm_accept be thread safe?
>
> Best regards.
>
> Hugo Meyer
>
> P.S.: This occurs with Open MPI 1.5.1 and also with an old version of the
> trunk (1.7).
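>
> In case it is useful, what the two attached examples do is roughly the
> following (a simplified sketch, not the literal attached code; the
> service name and thread count are placeholders and error handling is
> omitted):
>
>     /* event logger side (./test-thread): one MPI process per node,
>        several threads waiting for connections */
>     #include <mpi.h>
>     #include <pthread.h>
>
>     #define NUM_LOGGER_THREADS 4              /* placeholder value */
>
>     static char port[MPI_MAX_PORT_NAME];
>
>     static void *accept_loop(void *arg)
>     {
>         MPI_Comm client;
>         /* several threads can be inside MPI_Comm_accept at once;
>            this is where the error above shows up */
>         MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
>         /* ... receive the logged events from the client ... */
>         MPI_Comm_disconnect(&client);
>         return NULL;
>     }
>
>     int main(int argc, char *argv[])
>     {
>         int i, provided;
>         pthread_t tids[NUM_LOGGER_THREADS];
>
>         MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>         MPI_Open_port(MPI_INFO_NULL, port);
>         MPI_Publish_name("radic_eventlog", MPI_INFO_NULL, port);
>
>         for (i = 0; i < NUM_LOGGER_THREADS; i++)
>             pthread_create(&tids[i], NULL, accept_loop, NULL);
>         for (i = 0; i < NUM_LOGGER_THREADS; i++)
>             pthread_join(tids[i], NULL);
>
>         MPI_Unpublish_name("radic_eventlog", MPI_INFO_NULL, port);
>         MPI_Close_port(port);
>         MPI_Finalize();
>         return 0;
>     }
>
>     /* application side (./thread_logger_connect): called from the
>        vprotocol once a message has been received */
>     static void connect_and_log(void)
>     {
>         char logger_port[MPI_MAX_PORT_NAME];
>         MPI_Comm logger;
>
>         MPI_Lookup_name("radic_eventlog", MPI_INFO_NULL, logger_port);
>         MPI_Comm_connect(logger_port, MPI_INFO_NULL, 0, MPI_COMM_SELF,
>                          &logger);
>         /* ... send the event to be logged ... */
>         MPI_Comm_disconnect(&logger);
>     }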
> <event_logger.c><main-mpi-app.c>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel