[OMPI devel] Pack data mismatch in file dps_unpack.c 95/121

2006-04-20 Thread Galen M. Shipman
Hey Guys, Not sure what is going on here. Has anyone seen this before? - Galen Hi Galen, Sorry to bother you. I have installed the latest stable version of Open MPI (1.0) on two of the spider nodes (s7, s4) for some experiments, but there seems to be a configuration error or something else which I

Re: [OMPI devel] [OMPI svn] svn:open-mpi r10072

2006-05-25 Thread Galen M. Shipman
Note that this is also a problem in the other BTLs. I will be looking at them next, so don't close the ticket. I will modify it to indicate the other BTLs. On May 25, 2006, at 10:57 AM, gship...@osl.iu.edu wrote: Author: gshipman Date: 2006-05-25 12:57:14 EDT (Thu, 25 May 2006) New Re

Re: [OMPI devel] Query on zero-copy sends

2006-06-05 Thread Galen M. Shipman
On Jun 2, 2006, at 5:55 PM, Jonathan Day wrote: Hi, I'm working on developing some components for OpenMPI, but am a little unclear as to how to implement efficient sends and receives. I'm wanting to do zero-copy two-sided MPI, but as far as I can see, this is not going to be easy. As best as I

Re: [OMPI devel] [Fwd: [OMPI users] Error polling HP CQ on linux ppc64 w/Infiniband]

2006-06-29 Thread Galen M. Shipman
Hey Owen, Taking this on list.. If I run on n249 orte just hangs waiting for completion of the send. If I run on n248 I get: [ompi@node-192-168-111-248 ~]$ mpirun -np 1 -mca btl self,openib ./ring Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR) Failing at addr:0x10 [0] func:/home/om

Re: [OMPI devel] [Fwd: [OMPI users] Error polling HP CQ on linux ppc64 w/Infiniband]

2006-06-29 Thread Galen M. Shipman
0x04336dc8 in .__libc_start_main () from /lib64/libc.so.6 #17 0x00000000 in ?? () On Jun 29, 2006, at 2:33 PM, Galen M. Shipman wrote: Hey Owen, Taking this on list.. If I run on n249 orte just hangs waiting for completion of the send. If I run on n248 I get: [ompi@node

Re: [OMPI devel] [Fwd: [OMPI users] Error polling HP CQ on linux ppc64 w/Infiniband]

2006-06-29 Thread Galen M. Shipman
Okay, this was a build system issue in change CFLAGS... worked that out, and then found the real problem... using size_t with mca params instead of an int... Fix coming shortly.. On Jun 29, 2006, at 2:47 PM, Galen M. Shipman wrote: More info: Two cores are generated mpirun: (gdb) bt

Re: [OMPI devel] [openib-general] psm.h not found

2006-10-31 Thread Galen M. Shipman
Open MPI. I would point you toward QLogic support to obtain this library. Thanks, Galen M. Shipman Los Alamos National Labs Mike Aho wrote: I cannot find psm.h which header file mtl_psm.h calls out in ompi v1.2 12372. Any hints on where I would get that? Thanks. --Mike Michael E. Aho

Re: [O-MPI devel] OpenIB results

2005-08-24 Thread Galen M. Shipman
Hi Troy, Tim and I would like to discuss this with you as well. One thing I would ask: are you using the btl_mvapi_leave_pinned=1 option? Otherwise it is not an apples-to-apples comparison. - Galen On Aug 24, 2005, at 8:21 PM, Troy Benjegerdes wrote: I have some Netpipe graphs of OpenMPI a

Re: [O-MPI devel] pml vs bml vs btl

2005-08-31 Thread Galen M. Shipman
On Aug 31, 2005, at 1:06 PM, Jeff Squyres wrote: On Aug 29, 2005, at 9:17 PM, Brad Penoff wrote: PML: Pretty much the same as it was described in the paper. Its interface is basically MPI semantics (i.e., it sits right under MPI_SEND and the rest). BTL: Byte Transfer Layer; it's the next g

Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Galen M. Shipman
Looking into it now.. looks like a typo or two.. On Sep 13, 2005, at 1:50 PM, Nathan DeBardeleben wrote: Compiling I get: gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include -I../../../../include -I../../../../include -I../../../.. -I../../../.. -I../../../../include -I../../../../opal -I../

Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Galen M. Shipman
Thanks George, I didn't get a chance to test this from yesterday's merge. I will do so and commit any other needed changes. On Sep 13, 2005, at 2:18 PM, George Bosilca wrote: Please update again (rev 7352). I ran into the same problems yesterday when I compiled on thor, but I didn't commit as

[O-MPI devel] Registration Cache changes

2005-09-14 Thread Galen M. Shipman
Hi Gleb, Tim and I have incorporated some of the changes you mentioned into a new rcache framework. Currently there is a single component in this framework, rcache_rb, which is a registration cache based on a red-black tree and uses an MRU list if the registration is not persistent. Note

Re: [O-MPI devel] Registration Cache changes

2005-09-21 Thread Galen M. Shipman
Gleb, Gleb Natapov wrote: Hello Galen, Finally I've got some time to look through the new code. I have a couple of notes. In pml_ob1_rdma.c you try to merge registrations in a number of places. The code looks like this: btl_mpool->mpool_deregister(btl_mpool, reg); btl_mpool->mpool_regis

[O-MPI devel] p2p linpack ---

2005-09-25 Thread Galen M. Shipman
Well, after adding a bunch of debugging output, I have found the following. With both leave_pinned and use_mem_hook enabled on a linpack run we get the assertion error on the memory callback in linpack. That is to say, there is a free occurring in the middle of a registration. At the poi

Re: [O-MPI devel] p2p linpack ---

2005-09-26 Thread Galen M. Shipman
10:58 AM, Galen M. Shipman wrote: Well, after adding a bunch of debugging output, I have found the following. With both leave_pinned and use_mem_hook enabled on a linpack run we get the assertion error on the memory callback in linpack. That is to say, there is a free occurring in

[O-MPI devel] NPB- FT errors

2005-10-11 Thread Galen M. Shipman
When running the NPB - FT using 128 nodes, problem size C, I get the following error with both btl_tcp and btl_mvapi: -bash-3.00$ mpirun -np 128 -machinefile ~/dqlist -mca btl self,tcp -mca mpi_leave_pinned 0 ./bin/ft.C.128 NAS Parallel Benchmarks 2.3 -- FT Benchmark No input file inputft

Re: [O-MPI devel] [PATCH] casting is bad!

2005-10-31 Thread Galen M. Shipman
Gleb Yes, yes it is.. This was causing RNR NAK problems! Thanks, Galen On Oct 31, 2005, at 9:57 AM, Gleb Natapov wrote: Index: ompi/mca/btl/openib/btl_openib_component.c === --- ompi/mca/btl/openib/btl_openib_component.c(r

Re: [O-MPI devel] mvapi change

2005-11-22 Thread Galen M. Shipman
While this change is technically correct and should remain in the trunk, in both of these cases we are only using the convertor to get the address of the user's buffer. The data is contiguous in both of these cases, so the size should not be changed by the convertor, as far as I know. The short

[O-MPI devel] PGI configure failure..

2005-11-25 Thread Galen M. Shipman
On a fresh co of the trunk, after a successful autogen.sh I get the following error with this configure: ./configure CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 --disable-io-romio --with-mvapi=/usr/local/ib --enable-static --disable-shared --prefix=/u/gshipman/myapps *** Initialization, setup co

Re: [O-MPI devel] MPI_Probe_tag_c mvapi hand

2005-11-28 Thread Galen M. Shipman
Hi Andrew, I am not able to replicate this on odin with 16 nodes using the trunk or the v1.0 branch. How many nodes were you running with? Thanks, Galen On Nov 23, 2005, at 5:46 PM, Andrew Friedley wrote: I'm running the intel test suite against ompi revision r8247 (v1.0 branch), and t

Re: [O-MPI devel] MPI_Probe_tag_c mvapi hand

2005-11-29 Thread Galen M. Shipman
:42 PM, Galen M. Shipman wrote: Hi Andrew, I am not able to replicate this on odin with 16 nodes using the trunk or the v1.0 branch. How many nodes were you running with? Thanks, Galen On Nov 23, 2005, at 5:46 PM, Andrew Friedley wrote: I'm running the intel test suite against

Re: [O-MPI devel] [PATH] ompi_info doesn't show use_mem_hooks flag

2005-12-05 Thread Galen M. Shipman
On Mon, 5 Dec 2005, Gleb Natapov wrote: This is because there is no "mpool_base" mca (see patch). This looks good, will apply. Also there is code commented out that enables memory hooks if leave_pinned is set. Why is this code disabled? Infiniband will not work correctly in such a setup. Ther

Re: [O-MPI devel] [PATH] ompi_info doesn't show use_mem_hooks flag

2005-12-08 Thread Galen M. Shipman
On Thu, 8 Dec 2005, Gleb Natapov wrote: On Wed, Dec 07, 2005 at 10:40:51AM -0500, Brian Barrett wrote: Hopefully this made some sense. If not, on to the next round of e-mails :). This made a lot of sense. What is compiled by default now is malloc_hooks. I'll compile ptmalloc and play with it

Re: [O-MPI devel] btl_openib_reg_mru_len parameter

2006-02-06 Thread Galen M. Shipman
Gleb, I will take a look and get this into the trunk. Thanks, Galen On Feb 5, 2006, at 8:14 AM, Gleb Natapov wrote: Hello, the btl_openib_reg_mru_len parameter is not propagated to rcache in the current trunk. I can control the mru list length with the rcache_rb_mru_len parameter, but this parameter is

Re: [OMPI devel] MPI Applications

2006-03-05 Thread Galen M. Shipman
Hi Leslie, To start I would try running the Intel test suite. You can find this here: http://www-unix.mcs.anl.gov/mpi/mpi-test/tsuite.html There are also several other test suites available on this site. This will give you correctness first. For benchmarks you may try the NAS Parallel Benc