Hello,

I have a problem with a basic client-server application. I tried to run the C
program from this page:
https://github.com/hpc/cce-mpi-openmpi-1.7.1/blob/master/orte/test/mpi/singleton_client_server.c
I have seen this program mentioned in many discussions on your website, so I
expected it to work properly, but after more testing I found that there is
probably an error somewhere in the connect/accept code. I have read many
discussions and threads on your website, but I have not found a problem
similar to the one I am facing. It seems that nobody else has run into it.
When I run this app with one server and several clients (3, 4, 5, 6, ...),
the app sometimes hangs. It hangs when the second or a later client tries to
connect to the server (it varies: sometimes the third client hangs, sometimes
the fourth, sometimes the second, and so on).
In other words, the app hangs at the point where the server is waiting in
accept and the client is waiting in connect. It cannot continue, because this
client never connects to the server. This is strange, because I observe this
behaviour only in some runs: sometimes it works without any problems,
sometimes it does not. The behaviour is unpredictable.

I have installed Open MPI 1.10.2 on Fedora 19. I have the same problem
with the Java version of this application; it also hangs sometimes. I
need this app in Java, but first it must work properly in the C
implementation. Because of this strange behaviour I suspect that there may
be an error inside the Open MPI implementation of the connect/accept
methods. I also tried another version of Open MPI, 1.8.1, but the problem
did not disappear.

Could you help me find what is causing the problem? Maybe I have
misunderstood something about Open MPI (or connect/accept) and the problem
is on my side. I would appreciate any help, support, or interest in this
topic.

Best regards,
Matus Dobrotka
