Hey Guys,
Not sure what is going on here, has anyone seen this before?
- Galen
Hi Galen,
Sorry to bother you.
I have installed the latest stable version of Open MPI (1.0) on two of
the spider nodes (s7, s4) for some experiments, but there seems to be a
configuration error or something else which I
Note that this is also a problem in the other BTLs. I will be looking
at them next, so don't close the ticket. I will modify it to
indicate the other BTLs.
On May 25, 2006, at 10:57 AM, gship...@osl.iu.edu wrote:
Author: gshipman
Date: 2006-05-25 12:57:14 EDT (Thu, 25 May 2006)
New Re
On Jun 2, 2006, at 5:55 PM, Jonathan Day wrote:
Hi,
I'm working on developing some components for OpenMPI,
but am a little unclear as to how to implement
efficient sends and receives. I'm wanting to do
zero-copy two-sided MPI, but as far as I can see, this
is not going to be easy. As best as I
Hey Owen,
Taking this on list..
If I run on n249 orte just hangs waiting for completion of the send.
If I run on n248 I get:
[ompi@node-192-168-111-248 ~]$ mpirun -np 1 -mca btl self,openib ./ring
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x10
[0] func:/home/om
0x04336dc8 in .__libc_start_main () from /lib64/libc.so.6
#17 0x00000000 in ?? ()
On Jun 29, 2006, at 2:33 PM, Galen M. Shipman wrote:
Hey Owen,
Taking this on list..
If I run on n249 orte just hangs waiting for completion of the send.
If I run on n248 I get:
[ompi@node
Okay,
this was a build system issue in changing CFLAGS. I worked that out,
and then found the real problem: using size_t with mca params
instead of an int. Fix coming shortly.
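To illustrate why a size_t/int mix-up with integer parameters can cause trouble, here is a minimal sketch. It assumes a lookup routine that writes exactly sizeof(int) bytes to its destination; lookup_int_param, read_param_buggy and read_param_fixed are hypothetical names for illustration only and are not Open MPI's API. The thread does not say exactly how the mismatch showed up, so this is just one plausible failure mode on platforms where size_t is wider than int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an integer parameter lookup: it stores
 * exactly sizeof(int) bytes at dest, as an int-typed registry would.
 * This is an assumption for illustration, not a real Open MPI call. */
static void lookup_int_param(void *dest, int value)
{
    memcpy(dest, &value, sizeof(int));
}

/* Buggy pattern: the destination is a size_t. Where size_t is wider
 * than int, only part of the variable is overwritten and the rest
 * keeps whatever bytes were there before. */
static size_t read_param_buggy(int stored)
{
    size_t out;
    memset(&out, 0xff, sizeof(out));  /* simulate stale memory contents */
    lookup_int_param(&out, stored);
    return out;
}

/* Fixed pattern: read into an int, check the range, then widen. */
static size_t read_param_fixed(int stored)
{
    int tmp = 0;
    lookup_int_param(&tmp, stored);
    assert(tmp >= 0);                 /* a size cannot be negative */
    return (size_t)tmp;
}
```

Calling read_param_fixed(128) yields 128 everywhere, while read_param_buggy(128) yields a different value on any platform where size_t is wider than int, because the untouched bytes still hold the simulated stale contents.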
On Jun 29, 2006, at 2:47 PM, Galen M. Shipman wrote:
More info:
Two cores are generated
mpirun:
(gdb) bt
Open MPI. I
would point you toward QLogic support to obtain this library.
Thanks,
Galen M. Shipman
Los Alamos National Labs
Mike Aho wrote:
I cannot find psm.h, which the header file mtl_psm.h calls out, in ompi v1.2
12372. Any hints on where I would get that? Thanks.
--Mike
Michael E. Aho
Hi Troy,
Tim and I would like to discuss this with you as well. One thing I
would ask: are you using the btl_mvapi_leave_pinned=1 option?
Otherwise it is not an apples-to-apples comparison.
- Galen
On Aug 24, 2005, at 8:21 PM, Troy Benjegerdes wrote:
I have some Netpipe graphs of OpenMPI a
On Aug 31, 2005, at 1:06 PM, Jeff Squyres wrote:
On Aug 29, 2005, at 9:17 PM, Brad Penoff wrote:
PML: Pretty much the same as it was described in the paper. Its
interface is basically MPI semantics (i.e., it sits right under
MPI_SEND and the rest).
BTL: Byte Transfer Layer; it's the next g
Looking into it now.. looks like a typo or two..
On Sep 13, 2005, at 1:50 PM, Nathan DeBardeleben wrote:
Compiling I get:
gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include
-I../../../../include -I../../../../include -I../../../..
-I../../../.. -I../../../../include -I../../../../opal
-I../
Thanks George, I didn't get a chance to test this from yesterday's
merge. I will do so and commit any other needed changes.
On Sep 13, 2005, at 2:18 PM, George Bosilca wrote:
Please update again (rev 7352). I ran into the same problems yesterday
when I compiled on thor, but I didn't commit as
Hi Gleb,
Tim and I have incorporated some of the changes you mentioned into a
new rcache framework. Currently there is a single component in this
framework, rcache_rb, which is a registration cache based on a
red-black tree and uses an MRU list if the registration is not
persistent. Note
Gleb,
Gleb Natapov wrote:
Hello Galen,
Finally I've got some time to look through the new code.
I have a couple of notes. In pml_ob1_rdma.c you try to merge
registrations in a number of places. The code looks like this:
btl_mpool->mpool_deregister(btl_mpool, reg);
btl_mpool->mpool_regis
Well, after adding a bunch of debugging output, I have found the
following.
With both leave_pinned and use_mem_hook enabled on a linpack run we
get the assertion error on the memory callback in linpack. That is to
say, there is a free occurring in the middle of a registration.
At the poi
10:58 AM, Galen M. Shipman wrote:
Well, after adding a bunch of debugging output, I have found the
following.
With both leave_pinned and use_mem_hook enabled on a linpack run we
get the assertion error on the memory callback in linpack. That is
to say, there is a free occurring in
When running the NPB FT benchmark on 128 nodes with problem size C, I get the
following error with both btl_tcp and btl_mvapi:
-bash-3.00$ mpirun -np 128 -machinefile ~/dqlist -mca btl self,tcp -mca mpi_leave_pinned 0 ./bin/ft.C.128
NAS Parallel Benchmarks 2.3 -- FT Benchmark
No input file inputft
Gleb
Yes, yes it is.. This was causing RNR NAK problems!
Thanks,
Galen
On Oct 31, 2005, at 9:57 AM, Gleb Natapov wrote:
Index: ompi/mca/btl/openib/btl_openib_component.c
===
--- ompi/mca/btl/openib/btl_openib_component.c(r
While this change is technically correct and should remain in the
trunk, in both of these cases we are only using the convertor to get
the address of the user's buffer. The data is contiguous in both of these
cases so the size should not be changed by the convertor, as far as I
know. The short
On a fresh co of the trunk, after a successful autogen.sh I get the
following error with this configure:
./configure CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 --disable-io-romio --with-mvapi=/usr/local/ib --enable-static --disable-shared --prefix=/u/gshipman/myapps
*** Initialization, setup
co
Hi Andrew,
I am not able to replicate this on odin with 16 nodes using the trunk
or the v1.0 branch. How many nodes were you running with?
Thanks,
Galen
On Nov 23, 2005, at 5:46 PM, Andrew Friedley wrote:
I'm running the intel test suite against ompi revision r8247 (v1.0
branch), and t
:42 PM, Galen M. Shipman wrote:
Hi Andrew,
I am not able to replicate this on odin with 16 nodes using the
trunk or the v1.0 branch. How many nodes were you running with?
Thanks,
Galen
On Nov 23, 2005, at 5:46 PM, Andrew Friedley wrote:
I'm running the intel test suite against
On Mon, 5 Dec 2005, Gleb Natapov wrote:
This is because there is no "mpool_base" mca (see patch).
This looks good, will apply.
Also, there is code commented out that enables memory hooks if
leave_pinned is set. Why is this code disabled? Infiniband will
not work correctly in such a setup.
Ther
On Thu, 8 Dec 2005, Gleb Natapov wrote:
On Wed, Dec 07, 2005 at 10:40:51AM -0500, Brian Barrett wrote:
Hopefully this made some sense. If not, on to the next round of
e-mails :).
This made a lot of sense. What is compiled by default now is malloc_hooks
I'll compile ptmalloc and play with it
Gleb,
I will take a look and get this into the trunk.
Thanks,
Galen
On Feb 5, 2006, at 8:14 AM, Gleb Natapov wrote:
Hello,
The btl_openib_reg_mru_len parameter is not propagated to rcache in the
current trunk. I can control the mru list length with the
rcache_rb_mru_len parameter, but this parameter is
Hi Leslie,
To start, I would try running the Intel test suite. You can find it
here: http://www-unix.mcs.anl.gov/mpi/mpi-test/tsuite.html. There are
also several other test suites available on this site. This will give
you correctness first.
For benchmarks you may try the NAS Parallel Benc