[OMPI devel] BW benchmark hangs after r 18551

2008-06-17 Thread Lenny Verkhovsky
Hi George, I have a problem running the BW benchmark on a 100-rank cluster after r18551. The benchmark is mpi_p, which runs mpi_bandwidth with 100K between all pairs. #mpirun -np 100 -hostfile hostfile_w ./mpi_p_18549 -t bw -s 10 BW (100) (size min max avg) 10 576.734030 2001.882416 1062.69

Re: [OMPI devel] BW benchmark hangs after r 18551

2008-06-17 Thread George Bosilca
Lenny, I guess you're running the latest version. If not, please update; Galen and I corrected some bugs last week. If you're using the latest (and greatest) then ... well, I imagine there is at least one bug left. There is a quick test you can do. In the btl_sm.c in the module stru

Re: [OMPI devel] BW benchmark hangs after r 18551

2008-06-17 Thread Lenny Verkhovsky
It seems like we have 2 bugs here. 1. After committing NUMA awareness we see segfaults. 2. Before committing NUMA r18656 we see application hangs. 3. I checked it both with and without sendi, same results. 4. It hangs most of the time, but sometimes large msgs (>1M) are working. I will keep investigati

[OMPI devel] iprobe and opal_progress

2008-06-17 Thread Terry Dontje
I've run into an issue while running HPL where a message has been sent (in shared memory in this case) and the receiver calls iprobe but doesn't see said message on the first call to iprobe (even though it is there), but does see it on the second call to iprobe. Looking at the mca_pml_ob1_iprobe function
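A minimal standalone sketch (not part of the thread) of the polling pattern Terry describes, assuming a standard MPI installation: rank 0 sends a small message and rank 1 counts how many MPI_Iprobe calls it takes before the flag comes back true.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, flag = 0, calls = 0, payload = 42;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&payload, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* count how many iprobe calls are needed before the message shows up */
        do {
            MPI_Iprobe(0, 99, MPI_COMM_WORLD, &flag, &status);
            calls++;
        } while (!flag);
        printf("message visible after %d MPI_Iprobe call(s)\n", calls);
        MPI_Recv(&payload, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}

Running it with two ranks on one node (e.g. mpirun -np 2 ./iprobe_test) shows whether the first probe already reports the message over shared memory.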

[OMPI devel] RML Send

2008-06-17 Thread Leonardo Fialho
Hi All, I'm using RML to send log messages from a PML to an ORTE daemon (located on another node). I succeeded in sending the message header, but now I need to send the message data (buffer). How can I do it? The problem is which data type I need to use for packing/unpacking. I tried OPAL_DATA_V

Re: [OMPI devel] RML Send

2008-06-17 Thread Ralph H Castain
I'm not sure exactly how you are trying to do this, but the usual procedure would be: 1. call opal_dss.pack(*buffer, *data, #data, data_type) for each thing you want to put in the buffer. So you might call this to pack a string: opal_dss.pack(*buffer, &string, 1, OPAL_STRING); 2. once you have e
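A rough sketch of the pack-and-send side, assuming the 2008-era trunk internals (opal_buffer_t, opal_dss.pack, orte_rml.send_buffer); exact headers and signatures may differ between revisions, and send_log plus its arguments are made up for illustration:

#include "opal/dss/dss.h"
#include "orte/mca/rml/rml.h"

static int send_log(orte_process_name_t *daemon, char *msg, int32_t code)
{
    opal_buffer_t *buffer = OBJ_NEW(opal_buffer_t);
    int rc;

    /* pack each item in the order the receiver will unpack it; a real daemon
     * command flag would normally be packed first so process_commands can
     * dispatch on it */
    if (OPAL_SUCCESS != (rc = opal_dss.pack(buffer, &msg, 1, OPAL_STRING))) goto cleanup;
    if (OPAL_SUCCESS != (rc = opal_dss.pack(buffer, &code, 1, OPAL_INT32))) goto cleanup;

    /* hand the packed buffer to the RML; error handling is simplified here */
    rc = orte_rml.send_buffer(daemon, buffer, ORTE_RML_TAG_DAEMON, 0);

cleanup:
    OBJ_RELEASE(buffer);
    return rc;
}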

[OMPI devel] Open MPI v1.2.7rc1 has been posted

2008-06-17 Thread Tim Mattox
Hi All, The first release candidate of Open MPI v1.2.7 is now available: http://www.open-mpi.org/software/ompi/v1.2/ Please run it through its paces as best you can. -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmat...@gmail.com || timat...@open-mpi.org I'm a bright... http://www.

Re: [OMPI devel] RML Send

2008-06-17 Thread Leonardo Fialho
Hi Ralph, 1) Yes, I'm using ORTE_RML_TAG_DAEMON with a new "command" that I defined in "odls_types.h". 2) I'm packing and unpacking variables like OPAL_INT, OPAL_SIZE, ... 3) I'm not blocking the "process_commands" function with long code. 4) To know the daemon's vpid and jobid I used the same
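For the unpack side, a matching sketch of what a daemon handler for the new command might look like; the function name, headers and log output are illustrative only, and the unpack order has to mirror the pack order exactly:

#include <stdlib.h>
#include "opal/dss/dss.h"
#include "opal/util/output.h"

static void handle_log_cmd(opal_buffer_t *buffer)
{
    char *msg = NULL;
    int32_t code, n = 1;

    /* unpack in exactly the order the sender packed */
    if (OPAL_SUCCESS != opal_dss.unpack(buffer, &msg, &n, OPAL_STRING)) return;
    n = 1;
    if (OPAL_SUCCESS != opal_dss.unpack(buffer, &code, &n, OPAL_INT32)) {
        free(msg);
        return;
    }

    opal_output(0, "log from app: %s (code %d)", msg, code);
    free(msg);   /* the OPAL_STRING unpack allocates the string */
}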

Re: [OMPI devel] RML Send

2008-06-17 Thread Ralph Castain
On 6/17/08 3:35 PM, "Leonardo Fialho" wrote: > Hi Ralph, > > 1) Yes, I'm using ORTE_RML_TAG_DAEMON with a new "command" that I > defined in "odls_types.h". > 2) I'm packing and unpacking variables like OPAL_INT, OPAL_SIZE, ... > 3) I'm not blocking the "process_commands" function with long co