On Jul 14, 2008, at 5:18 PM, Sean Hefty wrote:
Open MPI certainly could be buggy with IBCM, of course -- but it's
fishy that the exact same "mpirun ..." command line works one time and
fails the next (it's kinda random when the problem occurs).
I just want to make sure that service ID collision isn't the issue.
(It may be unlikely, but it could happen.) The PID is effectively
random, and could conflict with other services, depending on the value
that's used. I know SDP reserves ranges of service ID values.
Ah! I did not realize that there were other services on the machine
that were using / reserving IBCM service IDs.
Is there a service ID range that is guaranteed to be available for
user apps?
Is the service ID specified in host or network order?
Host order -- just the result of getpid().
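
If the service ID turns out to be expected in network byte order, a
minimal sketch of the conversion might look like the following. The
helper names (to_network_order64, service_id_from_pid) are made up for
illustration and are not part of Open MPI or IBCM, and treating the
service ID as a 64-bit value is an assumption here.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: lay out a host-order 64-bit value as big-endian
 * (network order) bytes, independent of the host's endianness. */
static uint64_t to_network_order64(uint64_t host)
{
    unsigned char bytes[8];
    uint64_t out;

    for (int i = 0; i < 8; i++) {
        /* Most significant byte goes first. */
        bytes[i] = (unsigned char)(host >> (56 - 8 * i));
    }
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* Hypothetical helper: derive a service ID from a PID.
 * Assumption: the ID is a 64-bit value wanted in network order. */
static uint64_t service_id_from_pid(pid_t pid)
{
    return to_network_order64((uint64_t)pid);
}
```

Calling service_id_from_pid(getpid()) would give the value to hand to
the listen call. This only changes the byte layout; it does nothing
about collisions with ranges reserved by other services.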
Do you know the range of PIDs? I can see if any well known apps might
collide.
I never looked at the range of PIDs that failed. Pasha / Brad --
could you look into this? It might be that simple...
--
Jeff Squyres
Cisco Systems