Do you know what r# of 1.6 you were trying to compile? Is this via the
tarball or svn?
thanks,
--td
On 7/30/2012 9:41 AM, Daniel Junglas wrote:
Hi,
I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
Compilation and installation worked without a problem. However,
when trying to
but I have
to make this program. Please have a look at the picture at the link
below; maybe it will be clearer.
http://vipjg.nazwa.pl/sndfile_error.png
2012/7/30 TERRY DONTJE<terry.don...@oracle.com>:
On 7/30/2012 6:11 AM, Paweł Jaromin wrote:
Hello
Thanks for fast answer, but t
to the compiler.
This should give you an idea of the difference between your gcc and mpicc
compilation. I would suspect either mpicc is using a compiler
significantly different from gcc, or that mpicc might be passing some
optimization parameter that is messing up the code execution (just a guess).
I hope,
I am not sure I am understanding the problem correctly so let me
describe it back to you with a couple clarifications.
So your program using sf_open compiles successfully when using gcc and
mpicc. However, when you run the executable compiled using mpicc
sndFile is null?
If the above is
On 6/16/2012 8:03 AM, Roland Schulz wrote:
Hi,
I would like to start a single process without mpirun and then use
MPI_Comm_spawn to start up as many processes as required. I don't want
the parent process to take up any resources, so I tried to disconnect
the inter communicator and then
a firewall issue is something
to look into.
--td
On 6/7/2012 6:36 AM, Duke wrote:
On 6/7/12 5:31 PM, TERRY DONTJE wrote:
Can you get on one of the nodes and see the job's processes? If so
can you then attach a debugger to it and get a stack? I wonder if
the processes are stuck in MPI_Init
Another sanity check to try is to see if you can run your test program on
just one of the nodes. If that works, then more than likely MPI is having
issues setting up connections between the nodes.
--td
On 6/7/2012 6:06 AM, Duke wrote:
Hi again,
Somehow the verbose flag (-v) did not work for me. I
On 6/6/2012 4:38 AM, Siegmar Gross wrote:
Hello,
I compiled "openmpi-1.6" on "Solaris 10 sparc", "Solaris 10 x86",
and Linux (openSuSE 12.1) with "Sun C 5.12". Today I searched my
log-files for "WARNING" and found the following message.
WARNING:
This looks like a missing check in the sctp configure.m4. I am working
on a patch.
--td
On 6/5/2012 10:10 AM, Siegmar Gross wrote:
Hello,
I compiled "openmpi-1.6" on "Solaris 10 sparc" and "Solaris 10 x86"
with "gcc-4.6.2" and "Sun C 5.12". Today I searched my log-files for
"WARNING" and
BTW, the changes prior to r26496 failed some of the MTT test runs on
several systems. So if the current implementation is deemed not
"correct" I suspect we will need to figure out if there are changes to
the tests that need to be done.
See http://www.open-mpi.org/mtt/index.php?do_redir=2066
On 5/7/2012 8:40 PM, Jeff Squyres (jsquyres) wrote:
On May 7, 2012, at 8:31 PM, Jingcha Joba wrote:
So in the above stated example, end-start will be: +
20ms ?
(time slice of P2 + P3 = 20ms)
More or less (there's a nonzero amount of time required by the kernel scheduler,
and the time
On 5/4/2012 1:17 PM, Don Armstrong wrote:
On Fri, 04 May 2012, Rolf vandeVaart wrote:
On Behalf Of Don Armstrong
On Thu, 03 May 2012, Rolf vandeVaart wrote:
2. If that works, then you can also run with a debug switch to
see what connections are being made by MPI.
You can see the
On 5/4/2012 8:26 AM, Rolf vandeVaart wrote:
2. If that works, then you can also run with a debug switch to see
what connections are being made by MPI.
You can see the connections being made in the attached log:
[archimedes:29820] btl: tcp: attempting to connect() to [[60576,1],2] address
On 4/25/2012 1:00 PM, Jeff Squyres wrote:
On Apr 25, 2012, at 12:51 PM, Ralph Castain wrote:
Sounds rather bizarre. Do you have lstopo on your machine? Might be useful to
see the output of that so we can understand what it thinks the topology is like
as this underpins the binding code.
To determine if an MPI process is waiting for a message do what Rayson
suggested and attach a debugger to the processes and see if any of them
are stuck in MPI. Either internally in a MPI_Recv or MPI_Wait call or
looping on a MPI_Test call.
Other things to consider.
Is this the first time
Do you get any interfaces shown when you run "ibstat" on any of the
nodes your job is spawned on?
--td
On 2/15/2012 1:27 AM, Tohiko Looka wrote:
Mm... This is really strange
I don't have that service and there is no ib* output in 'ifconfig -a'
or 'Infinband' in 'lspci'
Which makes me believe
ompi_info should tell you the current version of Open MPI your path is
pointing to.
Are you sure your path is pointing to the area that the OpenFOAM package
delivered Open MPI into?
--td
On 1/27/2012 5:02 AM, Brett Tully wrote:
Interesting. In the same set of updates, I installed OpenFOAM
Is there a way to set up an interface analogous to Unix's loopback? I
suspect setting "-mca btl self,sm" wouldn't help since this is probably
happening while the processes are bootstrapping.
--td
On 1/16/2012 7:26 PM, Ralph Castain wrote:
The problem is that OMPI is looking for a tcp port
Do you have a stack of where exactly things are seg faulting in
blacs_pinfo?
--td
On 1/13/2012 8:12 AM, Conn ORourke wrote:
Dear Openmpi Users,
I am reserving several processors with SGE upon which I want to run a
number of openmpi jobs, all of which individually (and combined) use
less
I am a little confused by your problem statement. Are you saying you
want each MPI process to have multiple threads that can call MPI
concurrently? If so, you'll want to read up on the MPI_Init_thread
function.
--td
On 1/11/2012 7:19 AM, Hamilton Fischer wrote:
Hi, I'm actually
, Dec 15, 2011, at 07:00 AM, TERRY DONTJE wrote:
IIRC, RNR's are usually due to the receiving side not having a segment
registered and ready to receive data on a QP. The btl does go through a
big dance and does its own flow control to make sure this doesn't happen.
So when this happens are both the sending and receiving nodes using
mthca's
Are all the other processes gone? What version of OMPI are you using?
On 11/28/2011 9:00 AM, Mudassar Majeed wrote:
Dear people,
In my MPI application, all the processes call
the MPI_Finalize (all processes reach there) but the rank 0 process
could not finish with
On 11/23/2011 2:02 PM, Paul Kapinos wrote:
Hello Ralph, hello all,
Two news, as usual a good and a bad one.
The good: we believe we found out *why* it hangs.
The bad: it seems to me this is a bug, or at least an undocumented
feature, of Open MPI 1.5.x.
In detail:
As said, we see mystery
David, are you saying your jobs consistently leave behind session files
after the job exits? They really shouldn't; even in the case when a job
aborts, I thought mpirun took great pains to clean up after itself.
Can you tell us what version of OMPI you are running with? I think I
could see
Sorry please disregard my reply to this email.
:-)
--td
On 10/26/2011 10:44 AM, Ralph Castain wrote:
Did the version you are running get installed in /usr? Sounds like you are
picking up a different version when running a command - i.e., that your PATH is
finding a different installation
I am using prefix configuration so no it does not exist in /usr.
--td
This looks more like a seg fault in wrf and not OMPI.
Sorry not much I can do here to help you.
--td
On 10/25/2011 9:53 AM, Mouhamad Al-Sayed-Ali wrote:
Hi again,
This is exactly the error I have:
taskid: 0 hostname: part034.u-bourgogne.fr
[part034:21443] *** Process received signal
Can you run wrf successfully on one node?
Can you run a simple code across your two nodes? I would try hostname
then some simple MPI program like the ring example.
--td
On 10/25/2011 9:05 AM, Mouhamad Al-Sayed-Ali wrote:
hello,
-What version of ompi are you using
I am using ompi
Some more info would be nice like:
-What version of ompi are you using
-What type of machine and os are you running on
-What does the machine file look like
-Is there a stack trace left behind by the pid that seg faulted?
--td
On 10/25/2011 8:07 AM, Mouhamad Al-Sayed-Ali wrote:
Hello,
I have
h here ...!!
P5>> Both ...!! P3
P13>> Sender only ...!! P4
P13>> I could reach here ...!!
P6>> Both ...!! P5
P7>> Neither ...!!
P7>> I could reach here ...!!
P14>> I could reach here ...!!
P1>> Received from P7, packet contains rank: 11
On 7/15/2011 1:46 PM, Paul Kapinos wrote:
Hi OpenMPI volks (and Oracle/Sun experts),
we have a problem with Sun's MPI (Cluster Tools 8.2.x) on a part of
our cluster. In the part of the cluster where LDAP is activated, the
mpiexec does not try to spawn tasks on remote nodes at all, but
ving sides are captured on
the basis of MPI_ANY_SOURCE, that seems like it does not see the
destination of message while capturing it from message queue of the
MPI system.
regards,
Mudassar
*From:* Terry Dontje <terry.don...@
of the problem it is going to be nearly impossible
for us to tell you what is wrong.
--td
regards,
Mudassar
Date: Fri, 15 Jul 2011 07:04:34 -0400
From: Terry Dontje <terry.don...@oracle.com
<mailto:terry.don...@oracle.com>>
Subject: Re: [OMPI users] Urgent Question regarding, MPI_ANY_SO
in status.MPI_SOURCE, but it is different than
expected. I need to receive that message which was sent to me, not any
message.
regards,
Date: Fri, 15 Jul 2011 06:33:41 -0400
From: Terry Dontje <terry.don...@oracle.com
<mailto:terry.don...@oracle.com>>
Subject: Re: [OMPI users] Urgent Questi
Here's, hopefully, more useful info. Note: reading the job2core.pdf
presentation that was mentioned earlier more closely will also
clarify a couple of points (I've put those points inline below).
On 7/15/2011 12:01 AM, Ralph Castain wrote:
On Jul 14, 2011, at 5:46 PM, Jeff Squyres wrote:
Mudassar,
You can do what you are asking. The receiver uses MPI_ANY_SOURCE for
the source rank value and when you receive a message the
status.MPI_SOURCE will contain the rank of the actual sender not the
receiver's rank. If you are not seeing that then there is a bug somewhere.
--td
On
---
Regards,
Robert Walters
*From:*users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
*On Behalf Of *Terry Dontje
*Sent:* Monday, May 02, 2011 2:50 PM
*To:* us...@open-mpi.org
*Su
pi.org [mailto:users-boun...@open-mpi.org]
*On Behalf Of *Terry Dontje
*Sent:* Monday, May 02, 2011 2:50 PM
*To:* us...@open-mpi.org
*Subject:* Re: [OMPI users] OpenMPI LS-DYNA Connection refused
On 05/02/2011 02:04 PM, Robert Walters wrote:
Terry,
I was under the impression that all connecti
On 05/02/2011 02:04 PM, Robert Walters wrote:
Terry,
I was under the impression that all connections are made because of
the nature of the program that OpenMPI is invoking. LS-DYNA is a
finite element solver and for any given simulation I run, the cores on
each node must constantly
On 05/02/2011 11:30 AM, Jack Bryan wrote:
Thanks for your reply.
MPI is for academic purpose. How about business applications ?
There are quite a few non-academic MPI applications. For example,
there are quite a few simulation codes from different vendors that
support MPI (Nastran is
On 04/30/2011 08:52 PM, Jack Bryan wrote:
Hi, All:
What is the relationship between MPI communication and socket
communication ?
MPI may use socket communication to move data between two
processes. Aside from that, the two are used for different purposes.
Is the network socket
Paul and I have been talking about the below issue and I thought it
would be useful to update the list just in case someone else runs into
this problem and ends up searching the email list before we actually fix
the issue.
The problem is OMPI's configure tests to see if -lm is needed to get
On 04/07/2011 08:36 AM, Paul Kapinos wrote:
Hi Terry,
so, the attached ceil.c example file *can* be compiled by "CC" (the
Studio C++ compiler), but *cannot* be compiled using "cc" (the
Studio C compiler).
$ CC ceil.c
$ cc ceil.c
Did you try to link in the math library -lm? When I did this
On 04/07/2011 06:16 AM, Paul Kapinos wrote:
Dear OpenMPI developers,
We tried to build OpenMPI 1.5.3 including Support for Platform LSF
using the Sun Studio (=Oracle Solaris Studio now) /12.2 and the
configure stage failed.
1. Used flags:
./configure --with-lsf --with-openib
directory. None of those
are equivalent because they are all linked with VampirTrace, if I am
reading the names right. I've already tried putting
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a for this and it didn't
work, giving errors like
On Wed, Apr 6, 2011 at 12:42 PM, Terry Dontje <terry.
Something looks fishy about your numbers. The first two sets of numbers
look the same, and the last set does look better for the most part. Your
mpirun command line looks weird to me with the "-mca
orte_base_help_aggregate btl,openib,self,"; did something get chopped off
in the text copy? You
I am not sure Fedora comes with Open MPI installed on it by default (at
least my FC13 did not). You may want to look at installing
Open MPI from yum or some other package manager. Or you can download
the source tarball from http://www.open-mpi.org/software/ompi/v1.4/,
build and
It was asked during the community concall whether the below may be
related to ticket #2722 https://svn.open-mpi.org/trac/ompi/ticket/2722?
--td
On 04/04/2011 10:17 PM, David Zhang wrote:
Any error messages? Maybe the nodes ran out of memory? I know MPI
implement some kind of buffering under
On 04/05/2011 05:11 AM, SLIM H.A. wrote:
After an upgrade of our system I receive the following error message
(openmpi 1.4.2 with gridengine):
quote
--
Sorry! You were supposed to get help about:
libfui.so is a library that is part of the Solaris Studio Fortran tools. It
should be located under lib, relative to where your Solaris Studio compilers
are installed. So one question is whether you actually have Studio
Fortran installed on all your nodes or not.
--td
On 04/04/2011 04:02 PM, Ralph
Dave what version of Grid Engine are you using?
The plm checks for the following env-var's to determine if you are
running Grid Engine.
SGE_ROOT
ARC
PE_HOSTFILE
JOB_ID
If these are not there during the session that mpirun is executed then
it will resort to ssh.
--td
On 03/21/2011 08:24
On 03/17/2011 03:31 PM, vaibhav dutt wrote:
Hi,
Thanks for your reply. I tried to execute first a process by using
mpirun -machinefile hostfile.txt --slot-list 0:1 -np 1
but it gives the same as error as mentioned previously.
Then, I created a rankfile with contents"
rank 0=t1.tools.xxx
Of *Terry Dontje
*Sent:* Wednesday, February 09, 2011 5:02 PM
*To:* us...@open-mpi.org
*Subject:* Re: [OMPI users] Totalview not showing main program on
startup with OpenMPI 1.3.x and 1.4.x
This sounds like something I ran into some time ago that involved the
compiler omitting frame pointers
This sounds like something I ran into some time ago that involved the
compiler omitting frame pointers. You may want to try to compile your
code with -fno-omit-frame-pointer. I am unsure if you may need to do
the same while building MPI though.
--td
On 02/09/2011 02:49 PM, Dennis McRitchie
On 02/01/2011 07:34 PM, Jeff Squyres wrote:
On Feb 1, 2011, at 5:02 PM, Jeffrey A Cummings wrote:
I'm getting a lot of push back from the SysAdmin folks claiming that OpenMPI is
closely intertwined with the specific version of the operating system and/or
other system software (i.e., Rocks on
So are you trying to start an mpi job that one process is one executable
and the other process(es) are something else? If so, you probably want
to use a multiple app context. If you look at FAQ question 7. How do I
run an MPMD MPI Job at http://www.open-mpi.org/faq/?category=running
this
On 12/10/2010 03:24 PM, David Mathog wrote:
Ashley Pittman wrote:
For a much simpler approach you could also use these two environment
variables, this is on my current system which is 1.5 based, YMMV of course.
OMPI_COMM_WORLD_LOCAL_RANK
OMPI_COMM_WORLD_LOCAL_SIZE
However that doesn't really
On 12/10/2010 01:46 PM, David Mathog wrote:
The master is commonly very different from the workers, so I expected
there would be something like
--rank0-on
but there doesn't seem to be a single switch on mpirun to do that.
If "mastermachine" is the first entry in the hostfile, or the first
ust hate to see such a complex, time-consuming method when the info
is already available on every process.
On Dec 10, 2010, at 3:36 AM, Terry Dontje wrote:
A more portable way of doing what you want below is to gather each
process's processor name, given by MPI_Get_processor_name, and have the
root who
I am not sure this has anything to do with your problem but if you look
at the stack entry for PMPI_Recv I noticed the buf has a value of 0.
Shouldn't that be an address?
Does your code fail if the MPI library is built with -g? If it does
fail the same way, the next step I would do would be
Ticket 2632 really spells out what the issue is.
On 11/30/2010 10:23 AM, Prentice Bisbal wrote:
Nehemiah Dacres wrote:
that looks about right. So the suggestion:
./configure LDFLAGS="-notpath ... ... ..."
-notpath should be replaced by whatever the proper flag should be, in my case
-L ?
A slight note for the below there should be a space between "ld" and the
ending single quote mark so it should be '-Qoption ld ' not '-Qoption ld'
--td
On 11/30/2010 06:31 AM, Terry Dontje wrote:
Actually there is a way to modify the configure file that will not
require the
e them all to '-Qoption ld' and then do the configure things
should work.
Good luck,
--td
On 11/30/2010 06:19 AM, Terry Dontje wrote:
On 11/29/2010 05:41 PM, Nehemiah Dacres wrote:
thanks.
FYI: it's openmpi-1.4.2 from a tarball, like you assumed.
I changed this line
*Sun\ F* | *Sun*Fortran*)
This is ticket 2632 https://svn.open-mpi.org/trac/ompi/ticket/2632. A
fix has been put into the trunk last week. We should be able to CMR
this fix to the 1.5 and 1.4 branches later this week. The ticket
actually has a workaround for the 1.5 branch.
--td
On 11/29/2010 09:46 AM, Siegmar Gross
On 11/22/2010 08:18 PM, Paul Monday (Parallel Scientific) wrote:
This is a follow-up to an earlier question. I'm trying to understand how --mca
btl prioritizes its choice for connectivity. Going back to my original
network, there are actually two networks running around. A point to point
You're gonna have to use a protocol that can route through a machine,
OFED User Verbs (ie openib) does not do this. The only way I know of to
do this via OMPI is with the tcp btl.
--td
On 11/22/2010 09:28 AM, Paul Monday (Parallel Scientific) wrote:
We've been using OpenMPI in a switched
Yes, I believe this solves the mystery. In short OGE and ORTE both
work. In the linear:1 case the job is exiting because there are not
enough resources for the orte binding to work, which actually makes
sense. In the linear:2 case I think we've proven that we are binding to
the right amount
Perhaps if someone could run this test again with --report-bindings
--leave-session-attached and provide -all- output we could verify that
analysis and clear up the confusion?
Yeah, however I bet you we still won't see output.
--td
On Wed, Nov 17, 2010 at 8:13 AM, Terry Dontje <terry.don.
and email flying around it would be nice to actually
see the output you mention.
--td
On Wed, Nov 17, 2010 at 7:51 AM, Terry Dontje <terry.don...@oracle.com
<mailto:terry.don...@oracle.com>> wrote:
On 11/17/2010 09:32 AM, Ralph Castain wrote:
Cris' output is coming solely
-attached is not required when
the OGE binding argument is not given.
--td
HTH
Ralph
On Wed, Nov 17, 2010 at 6:57 AM, Terry Dontje <terry.don...@oracle.com
<mailto:terry.don...@oracle.com>> wrote:
On 11/17/2010 07:41 AM, Chris Jewell wrote:
On 17 Nov 2010, at 11:56, Terry
On 11/17/2010 07:41 AM, Chris Jewell wrote:
On 17 Nov 2010, at 11:56, Terry Dontje wrote:
You are absolutely correct, Terry, and the 1.4 release series does include the
proper code. The point here, though, is that SGE binds the orted to a single
core, even though other cores are also
On 11/16/2010 08:24 PM, Ralph Castain wrote:
On Tue, Nov 16, 2010 at 12:23 PM, Terry Dontje
<terry.don...@oracle.com <mailto:terry.don...@oracle.com>> wrote:
On 11/16/2010 01:31 PM, Reuti wrote:
Hi Ralph,
Am 16.11.2010 um 15:40 schrieb Ralph Castain:
2. h
On 11/16/2010 01:31 PM, Reuti wrote:
Hi Ralph,
Am 16.11.2010 um 15:40 schrieb Ralph Castain:
2. have SGE bind procs it launches to -all- of those cores. I believe SGE does
this automatically to constrain the procs to running on only those cores.
This is another "bug/feature" in SGE: it's a
On 11/16/2010 12:13 PM, Chris Jewell wrote:
On 16 Nov 2010, at 14:26, Terry Dontje wrote:
In the original case of 7 nodes and processes if we do -binding pe linear:2,
and add the -bind-to-core to mpirun I'd actually expect 6 of the nodes
processes bind to one core and the 7th node with 2
On 11/16/2010 10:59 AM, Reuti wrote:
Am 16.11.2010 um 15:26 schrieb Terry Dontje:
1. allocate a specified number of cores on each node to your job
this is currently the bug in the "slot<=> core" relation in SGE, which has to
be removed, updated or clarified. For now slo
On 11/16/2010 09:08 AM, Reuti wrote:
Hi,
Am 16.11.2010 um 14:07 schrieb Ralph Castain:
Perhaps I'm missing it, but it seems to me that the real problem lies in the interaction
between SGE and OMPI during OMPI's two-phase launch. The verbose output shows that SGE
dutifully allocated the
On 11/16/2010 04:26 AM, Chris Jewell wrote:
Hi all,
On 11/15/2010 02:11 PM, Reuti wrote:
Just to give my understanding of the problem:
Sorry, I am still trying to grok all your email as what the problem you
are trying to solve. So is the issue is trying to have two jobs having
processes on
On 11/15/2010 02:11 PM, Reuti wrote:
Just to give my understanding of the problem:
Am 15.11.2010 um 19:57 schrieb Terry Dontje:
On 11/15/2010 11:08 AM, Chris Jewell wrote:
Sorry, I am still trying to grok all your email as what the problem you
are trying to solve. So is the issue is trying
Sorry, I am still trying to grok from all your email what problem you
are trying to solve. So, is the issue trying to have two jobs with
processes on the same node be able to bind their processes to different
resources? Like core 1 for the first job and cores 2 and 3 for the 2nd job?
I am able to build on Linux systems with Sun C 5.11 using gcc-4.1.2.
Still trying to get a version of gcc 4.3.4 compiled on our systems so I
can use it with Sun C 5.11 to build OMPI.
--td
On 11/01/2010 05:58 AM, Siegmar Gross wrote:
Hi,
Sorry, but can you give us the config line, the
Sorry, but can you give us the config line, the config.log and the
full output of make, preferably with make V=1?
--td
On 10/29/2010 04:30 AM, Siegmar Gross wrote:
Hi,
I tried to build Open MPI 1.5 on Solaris X86 and x86_64 with Oracle
Studio 12.2. I can compile Open MPI with thread support,
So what you are saying is *all* the ranks have entered MPI_Finalize
and only a subset has exited per placing prints before and after
MPI_Finalize. Good. So my guess is that the processes stuck in
MPI_Finalize have a prior MPI request outstanding that for whatever
reason is unable to
When you do a make, can you add V=1 to have the actual compile lines
printed out? That will probably show you the line with
-fno-directives-only in it. Which is odd, because I think that option is
a gcc-ism and I don't know why it would show up in a Studio build (note my
build doesn't show
On 10/21/2010 10:18 AM, Jeff Squyres wrote:
Terry --
Can you file relevant ticket(s) for v1.5 on Trac?
Once I have more information and have proven it isn't due to us using
old compilers or a compiler error itself.
--td
On Oct 21, 2010, at 10:10 AM, Terry Dontje wrote:
I've reproduced
st be left over cruft.
Note, my compiler hang disappeared on me. So maybe there was an
environmental issue on my side.
--td
On 10/21/2010 06:47 AM, Terry Dontje wrote:
On 10/21/2010 06:43 AM, Jeff Squyres (jsquyres) wrote:
Also, i'm not entirely sure what all the commands are that you ar
I wonder if the error below could be due to crap being left over in the
source tree. Can you do a "make clean"? Note, on a new checkout from
the v1.5 svn branch I was able to build 64-bit with the following
configure line:
../configure FC=f95 F77=f77 CC=cc CXX=CC --without-openib
--without-udapl
Can you remove the -with-threads and -enable-mpi-threads options from
the configure line and see if that helps your 32 bit problem any?
--td
On 10/20/2010 09:38 AM, Siegmar Gross wrote:
Hi,
I have built Open MPI 1.5 on Linux x86_64 with the Oracle/Sun Studio C
compiler. Unfortunately
On 10/05/2010 10:23 AM, Storm Zhang wrote:
Sorry, I should say one more thing about the 500 procs test. I tried
to run two 500 procs at the same time using SGE and it runs fast and
finishes at the same time as the single run. So I think OpenMPI can
handle them separately very well.
For the
relaxed ordering memory operations. If I remember
correctly, it was some IBM platform.
Do you know if relaxed memory ordering is enabled on your platform ? If it is
enabled you have to disable eager rdma.
Regards,
Pasha
On Sep 29, 2010, at 1:04 PM, Terry Dontje wrote:
Pasha, do you by any chance
Pasha, do you by any chance know who at Mellanox might be responsible
for OMPI working?
--td
Eloi Gaudry wrote:
Hi Nysal, Terry,
Thanks for your input on this issue.
I'll follow your advice. Do you know any Mellanox developer I may
discuss with, preferably someone who has spent some time
d enclosed the requested check outputs (using -output-filename
stdout.tag.null option).
I'm displaying frag->hdr->tag here.
Eloi
On Monday 27 September 2010 16:29:12 Terry Dontje wrote:
Eloi, sorry can you print out frag->hdr->tag?
Unfortunately from your last email I think it will s
mpi/mca/btl/openib/btl_openib_component.c::handle_wc in the
SEND/RDMA_WRITE case, but this is all I can think of alone.
You'll find a stacktrace (receive side) in this thread (10th or 11th
message) but it might be pointless.
Regards,
Eloi
On Monday 27 September 2010 11:43:55 Terry Dontje wrote:
So it soun
side to figure out what might make it generate at 0 hdr->tag.
Or maybe instrument the send side to stop when it is about ready to send
a 0 hdr->tag and see if we can see how the code got there.
I might have some cycles to look at this Monday.
--td
Eloi
On Friday 24 September 201
as hdr->tag = 0 in btl_openib_component.c:2881) yet.
Eloi
/home/pp_fr/st03230/EG/Softs/openmpi-custom-1.4.2/bin/
On Thursday 23 September 2010 23:33:48 Terry Dontje wrote:
Eloi, I am curious about your problem. Can you tell me what size of job
it is? Does it always fail on the s