Excellent!
We developers have talked about creating an FAQ entry for running at
large scale for a long time, but have never gotten a round tuit. I
finally filed a ticket to do this (https://svn.open-mpi.org/trac/ompi/ticket/1503
) -- these pending documentation tickets will likely be
Hi,
I am happy to state that I believe I have finally found the fix for the No
route to host error
The solution was to increase the ARP cache size on the head node and also add a
few static ARP entries. The cache was running out at some point during the
program execution, leading to connection
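For anyone hitting the same thing: on Linux the ARP (neighbour) cache limits are the net.ipv4.neigh gc_thresh sysctls. Here is a sketch of the kind of change described above; the threshold values and the sample address/MAC are made up, and the writes need root:

```shell
# Inspect the current neighbour-cache thresholds, if this kernel exposes them.
for f in /proc/sys/net/ipv4/neigh/default/gc_thresh1 \
         /proc/sys/net/ipv4/neigh/default/gc_thresh2 \
         /proc/sys/net/ipv4/neigh/default/gc_thresh3; do
  if [ -r "$f" ]; then
    echo "$f = $(cat "$f")"
  fi
done
# Raise them so ~500 nodes fit comfortably (illustrative values; needs root):
#   sysctl -w net.ipv4.neigh.default.gc_thresh1=1024
#   sysctl -w net.ipv4.neigh.default.gc_thresh2=2048
#   sysctl -w net.ipv4.neigh.default.gc_thresh3=4096
# And pin a static ARP entry for a node (hypothetical IP/MAC):
#   arp -s 10.12.77.21 00:11:22:33:44:55
```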
Simply to keep track of what's going on:
I checked the build environment for openmpi and the system's settings:
they were built using gcc 3.4.4 with -Os, which is reputed to be unstable and
problematic with this compiler version. I've asked Prasanna to rebuild
using -O2, but this could be a bit
Prasanna,
Please send me your /etc/make.conf and the contents of
/var/db/pkg/sys-cluster/openmpi-1.2.7/
You can package this with the following command line:
tar -cjf data.tbz /etc/make.conf /var/db/pkg/sys-cluster/openmpi-1.2.7/
And simply send me the data.tbz file.
Thanks,
Eric
Hi,
I did make sure at the beginning that only eth0 was activated on all the
nodes. Nevertheless, I am currently verifying the NIC configuration on all
the nodes and making sure things are as expected.
While trying different things, I did come across this peculiar error which I
had detailed in
Hi Prasanna, do you have any unusual ethernet interfaces on your
nodes? I have seen similar problems when using IP over Infiniband.
I'm not sure exactly why, but mixing interfaces of different types
(ib0 and eth0 for example) can sometimes cause these problems,
possibly because they are on
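If mixed interface types are the suspect, Open MPI can be pinned to a single interface at run time. A sketch of such a launch line, echoed rather than executed here since it needs the 499-node cluster from this thread (btl_tcp_if_include and oob_tcp_include are real 1.2-era MCA parameters; eth0 and the paths are taken from the thread; check ompi_info --param all all on your build):

```shell
# Restrict both the TCP BTL and the out-of-band channel to eth0.
cmd="mpirun -np 499 -bynode -hostfile nodelist \
  -mca btl_tcp_if_include eth0 \
  -mca oob_tcp_include eth0 \
  /main/mpiHelloWorld"
echo "$cmd"
```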
Hi,
I have verified the openMPI version to be 1.2.7 on all the nodes, and
ompi_info | grep thread reports "Thread support: posix (mpi: no, progress: no)" on
these machines.
I get the error with and without -mca oob_tcp_listen_mode listen_thread.
Sometimes, the startup takes too long with the
On Sep 11, 2008, at 6:29 PM, Prasanna Ranganathan wrote:
I have tried the following to no avail.
On 499 machines running openMPI 1.2.7:
mpirun -np 499 -bynode -hostfile nodelist /main/mpiHelloWorld ...
With different combinations of the following parameters
-mca btl_base_verbose 1 -mca
Prasanna,
I opened up a bug report to enable better control over the
threading options (http://bugs.gentoo.org/show_bug.cgi?id=237435). In
the meanwhile, if your helloWorld isn't too fluffy, could you send it
over (off list if you prefer) so I can take a look at it, the
Segmentation
Jeff Squyres wrote:
On Sep 11, 2008, at 3:27 PM, Eric Thibodeau wrote:
Ok, in addition to the information from the README, I'm thinking none of
the 3 configure options has an impact on the said 'threaded TCP
listener' and the MCA option you suggested should still work; is this
correct?
It
On Sep 11, 2008, at 3:27 PM, Eric Thibodeau wrote:
Ok, in addition to the information from the README, I'm thinking none of
the 3 configure options has an impact on the said 'threaded TCP
listener' and the MCA option you suggested should still work; is
this correct?
It should default to
Jeff Squyres wrote:
On Sep 11, 2008, at 2:38 PM, Eric Thibodeau wrote:
In short:
Which of the 3 options is the one known to be unstable in the following:
--enable-mpi-threads      Enable threads for MPI applications (default: disabled)
--enable-progress-threads
On Sep 11, 2008, at 2:38 PM, Eric Thibodeau wrote:
In short:
Which of the 3 options is the one known to be unstable in the
following:
--enable-mpi-threads      Enable threads for MPI applications (default: disabled)
--enable-progress-threads
The two configuration options that are disabled by default (--enable-
mpi-threads and --enable-progress-threads) are both known to be unstable.
The runtime listen_thread option is quite different and is known to be safe.
Ralph
On Sep 11, 2008, at 12:38 PM, Eric Thibodeau wrote:
Jeff,
In short:
Which
Jeff,
In short:
Which of the 3 options is the one known to be unstable in the following:
--enable-mpi-threads      Enable threads for MPI applications (default: disabled)
--enable-progress-threads
Enable threads asynchronous communication
Jeff Squyres wrote:
I'm not sure what USE=-threads means, but I would discourage the use
of threads in the v1.2 series; our thread support is pretty much
broken in the 1.2 series.
That's exactly what it means, hence the following BFW I had originally
inserted in the package to this effect:
Prasanna Ranganathan wrote:
Hi Eric,
Thanks a lot for the reply.
I am currently working on upgrading to 1.2.7
I do not quite follow your directions; what do you refer to when you say
"try with USE=-threads..."?
I am referring to the USE variable which is used to set global package
Hi,
I have upgraded to 1.2.7 and am still noticing the issue.
Kindly help.
>
> Message: 1
> Date: Mon, 8 Sep 2008 16:43:33 -0400
> From: Jeff Squyres
> Subject: Re: [OMPI users] Need help resolving No route to host error
> with OpenMPI 1.1.2
> To: Open MPI Users
Hi Eric,
Thanks a lot for the reply.
I am currently working on upgrading to 1.2.7
I do not quite follow your directions; what do you refer to when you say
"try with USE=-threads..."?
Kindly excuse if it is a silly question and pardon my ignorance :D
Regards,
Prasanna.
Prasanna, also make sure you try with USE=-threads ...as the ebuild
states, it's _experimental_ ;)
Keep your eye on:
https://svn.open-mpi.org/trac/ompi/wiki/ThreadSafetySupport
Eric
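For readers not on Gentoo: USE=-threads turns the experimental threads USE flag off for the build. A sketch of making that stick for this one package (the file path follows the standard Portage layout; adjust to yours):

```shell
# The line to append (as root) to /etc/portage/package.use so the
# experimental threads flag stays off for Open MPI only:
useline="sys-cluster/openmpi -threads"
echo "$useline"
# As root:  echo "$useline" >> /etc/portage/package.use
# Then rebuild the package with the new flags:
#   emerge --oneshot sys-cluster/openmpi
```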
Prasanna Ranganathan wrote:
Hi,
I have upgraded my openMPI to 1.2.6 (We have gentoo and emerge showed
Prasanna Ranganathan wrote:
Hi,
I have upgraded my openMPI to 1.2.6 (We have gentoo and emerge showed
1.2.6-r1 to be the latest stable version of openMPI).
Prasanna, do a sync, 1.2.7 is in portage and report back.
Eric
I do still get the following error message when running my test
Hi,
I have upgraded my openMPI to 1.2.6 (We have gentoo and emerge showed
1.2.6-r1 to be the latest stable version of openMPI).
I do still get the following error message when running my test helloWorld
program:
[10.12.77.21][0,1,95][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_c
Hi Jeff/Paul,
Thanks a lot for your replies.
I am looking into upgrading MPI to a newer version. As I use a few custom
built libraries as part of my main parallel application that recommend the
use of 1.1.2, I first need to check compatibility issues with the newer
version before I can
Hi,
First, consider updating to a newer OpenMPI.
Second, look at the environment on the box that starts OpenMPI (runs
mpirun ...).
Type
ulimit -n
to see how many file descriptors your environment allows (ulimit -a
for all limits). Note that every process on older versions of OpenMPI (prior
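To illustrate the check being suggested (the 4096 below is an arbitrary example value; the real requirement depends on how many sockets mpirun opens per process):

```shell
# Current soft limit on open file descriptors for this shell:
ulimit -n
# All limits at once:
ulimit -a
# Raise the soft limit for this shell (example value; capped by the hard
# limit, which only root or /etc/security/limits.conf can raise):
#   ulimit -n 4096
```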
Hi,
I am trying to run a test mpiHelloWorld program that simply initializes the
MPI environment on all the nodes, prints the hostname and rank of each node
in the MPI process group and exits.
I am using MPI 1.1.2 and am running 997 processes on 499 nodes (Nodes have 2
dual core CPUs).
I get the
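The test program described (initialize MPI, print each rank's hostname, exit) is essentially the classic MPI hello world. A sketch of what such a program could look like; this is a reconstruction, not Prasanna's actual /main/mpiHelloWorld:

```shell
# Write a minimal MPI hello world like the one described in this thread.
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* set up the MPI environment  */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank         */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes in the job  */
    MPI_Get_processor_name(name, &len);     /* hostname of this node       */
    printf("Hello from rank %d of %d on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}
EOF
# Compile and launch (requires an MPI install and the 499-node hostfile):
#   mpicc hello.c -o mpiHelloWorld
#   mpirun -np 997 -bynode -hostfile nodelist ./mpiHelloWorld
```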