I meant it works now, sorry for the confusion.
Running the test revealed a warning on memory registration, which we fixed
by setting ulimit -l to unlimited. Then running the OMPI sample worked too.
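For anyone hitting the same warning: the locked-memory limit can also be raised persistently via /etc/security/limits.conf instead of a per-shell ulimit. This is a sketch assuming a standard PAM-based Linux setup; the exact files on the cluster in question aren't shown in the thread:

```shell
# Per-shell fix (must be in effect before the MPI processes launch):
ulimit -l unlimited

# Persistent fix: add these lines to /etc/security/limits.conf
# so the memlock limit applies to new login sessions:
#   * soft memlock unlimited
#   * hard memlock unlimited

# Verify the effective limit in a fresh session:
ulimit -l
```

Note that for jobs launched through a resource manager, the daemons also need to inherit the raised limit, not just the interactive shell.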
Thank you,
saliya
On Sun, Dec 28, 2014 at 11:18 PM, Ralph Castain wrote:
So you are saying the test worked, but you are still encountering an error when
executing an MPI job? Or are you saying things now work?
> On Dec 28, 2014, at 5:58 PM, Saliya Ekanayake wrote:
Thank you Ralph. This produced the warning on memory limits similar to [1]
and setting ulimit -l unlimited worked.
[1] http://lists.openfabrics.org/pipermail/general/2007-June/036941.html
Saliya
On Sun, Dec 28, 2014 at 5:57 PM, Ralph Castain wrote:
Have the admin try running the ibv_ud_pingpong test - that will exercise the
portion of the system under discussion.
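For context, ibv_ud_pingpong (shipped with libibverbs-utils) is run pairwise, server first, then client. The hostname below is a placeholder, and options vary by version; this is a sketch, not the exact invocation from the thread:

```shell
# On the first node, start the server side (it waits for a peer):
ibv_ud_pingpong

# On a second node, connect to the server by hostname
# ("node01" is a hypothetical name):
ibv_ud_pingpong node01
```

A failure here points at the UD transport itself, independent of Open MPI.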
> On Dec 28, 2014, at 2:31 PM, Saliya Ekanayake wrote:
What I heard from the administrator is that,
"The tests that work are the simple utilities ib_read_lat and ib_read_bw
that measure latency and bandwidth between two nodes. They are part of
the "perftest" repo package."
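For reference, those perftest utilities follow the same server/client pattern, which only exercises RDMA read, not the UD transport Open MPI's wireup needs. A sketch with a placeholder hostname:

```shell
# Server side (run on node01, each waits for one client):
ib_read_lat
ib_read_bw

# Client side, from another node, connecting by hostname:
ib_read_lat node01
ib_read_bw node01
```

So these passing is consistent with the UD-specific failure seen in the MPI job.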
On Dec 28, 2014 10:20 AM, "Saliya Ekanayake" wrote:
This happens at MPI_Init. I've attached the full error message.
The sys admin mentioned Infiniband utility tests ran OK. I'll contact him
for more details and let you know.
Thank you,
Saliya
On Sun, Dec 28, 2014 at 3:18 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
Might also be worth checking to ensure that UD is enabled on your IB
installation as we depend upon it for wireup of IB connections.
> On Dec 28, 2014, at 12:18 AM, Gilles Gouaillardet wrote:
Where does the error occur?
MPI_Init?
MPI_Finalize?
In between?
In the first case, the bug is likely a mishandled error case,
which means OpenMPI is unlikely to be the root cause of the crash.
Did you check that Infiniband is up and running on your cluster?
Cheers,
Gilles
Saliya Ekanayake
It's been a while on this, but we are still having trouble getting OpenMPI
to work with Infiniband on this cluster. We tried with latest 1.8.4 as
well, but it's still the same.
To recap, we get the following error when MPI initializes (in the simple
Hello world C example) with Infiniband.
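For completeness, the "simplest hello.c" being discussed is presumably along these lines (a sketch; the exact program from the thread isn't shown). With the openib problem present, the crash occurs inside MPI_Init():

```c
/* Minimal MPI hello world; requires an MPI toolchain (mpicc) to build. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);  /* the segfault reported above happens here */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```

That the failure is this early, before any application communication, is what points at BTL setup rather than application code.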
Thank you Jeff, I'll try this and let you know.
Saliya
On Nov 10, 2014 6:42 AM, "Jeff Squyres (jsquyres)"
wrote:
I am sorry for the delay; I've been caught up in SC deadlines. :-(
I don't see anything blatantly wrong in this output.
Two things:
1. Can you try a nightly v1.8.4 snapshot tarball? This will check to see if
whatever the bug is has been fixed for the upcoming release:
Hi Jeff,
You are probably busy, but just checking if you had a chance to look at
this.
Thanks,
Saliya
On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake wrote:
Hi Jeff,
I've attached a tar file with information.
Thank you,
Saliya
On Tue, Nov 4, 2014 at 4:18 PM, Jeff Squyres (jsquyres) wrote:
Looks like it's failing in the openib BTL setup.
Can you send the info listed here?
http://www.open-mpi.org/community/help/
On Nov 4, 2014, at 1:10 PM, Saliya Ekanayake wrote:
Hi Howard,
I just tried with 1.8.3 as well and it produces the same error. We have
another cluster where both versions work fine, which is why I was curious
as to what kind of things could cause this.
Thank you,
Saliya
On Tue, Nov 4, 2014 at 1:31 PM, Howard Pritchard wrote:
Hello Saliya,
Would you mind trying to reproduce the problem using the latest 1.8 release
- 1.8.3?
Thanks,
Howard
2014-11-04 11:10 GMT-07:00 Saliya Ekanayake :
Hi,
I am using OpenMPI 1.8.1 in a Linux cluster that we recently set up. It
builds fine, but when I try to run even the simplest hello.c program it
causes a segfault. Any suggestions on how to correct this?
The steps I did and error message are below.
1. Built OpenMPI 1.8.1 on the cluster. The
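The steps above are cut off in the archive, but a typical source build and test run of Open MPI 1.8.x looks roughly like this (version numbers, prefix, and hostfile name are illustrative, not taken from the original message):

```shell
# Build Open MPI from a release tarball into a user-local prefix:
tar xf openmpi-1.8.1.tar.bz2
cd openmpi-1.8.1
./configure --prefix=$HOME/openmpi-1.8.1-install
make -j4 all
make install

# Compile the test program and launch it across nodes
# (assumes $PATH/$LD_LIBRARY_PATH point at the install above):
mpicc hello.c -o hello
mpirun -np 2 --hostfile hosts ./hello
```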