Hello all,

I temporarily worked around my former problem by using synchronous communication and shifting the initialization
into the first call of a collective operation.
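Roughly, the workaround looks like the sketch below (the names are made up; the real code is part of a larger library). The idea is to let a collective establish the connections before any point-to-point traffic and to use synchronous sends afterwards:

#include "mpi.h"

/* Hypothetical init routine: a collective forces connection setup
   on all processes before the first point-to-point message. */
static void my_init(void)
{
    MPI_Barrier(MPI_COMM_WORLD);
}

/* Synchronous send: MPI_Ssend only returns once the matching
   receive has started on the destination rank. */
static void my_send(int *buf, int n, int dest, int tag)
{
    MPI_Ssend(buf, n, MPI_INT, dest, tag, MPI_COMM_WORLD);
}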
But nevertheless, I found a performance bug in btl_openib.

When I execute the attached sendrecv.c on 4 (or more) nodes of a Pentium D cluster with InfiniBand, each receiving process gets only 8 messages within a few seconds and then does nothing for at least 20 seconds. (I executed the following command and hit Ctrl-C 20 seconds after the last output.)
wassen@elrohir:~/src/mpi_test$ mpirun -np 4 -host pd-01,pd-02,pd-03,pd-04 -mca btl openib,self sendrecv
[3] received data[0]=1
[1] received data[0]=1
[1] received data[1]=2
[1] received data[2]=3
[1] received data[3]=4
[1] received data[4]=5
[1] received data[5]=6
[1] received data[6]=7
[1] received data[7]=8
[2] received data[0]=1
[2] received data[1]=2
[2] received data[2]=3
[2] received data[3]=4
[2] received data[4]=5
[2] received data[5]=6
[2] received data[6]=7
[2] received data[7]=8
[3] received data[1]=2
[3] received data[2]=3
[3] received data[3]=4
[3] received data[4]=5
[3] received data[5]=6
[3] received data[6]=7
[3] received data[7]=8
{20 sec. later...}
mpirun: killing job...

When I execute the same program with "-mca btl udapl,self" or "-mca btl tcp,self", it runs fine and terminates in less than a second. I tried this with Open MPI 1.2.1 and 1.2.3. The test program runs fine with several other MPIs (Intel MPI and MVAPICH with InfiniBand, MP-MPICH with SCI).
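For comparison, the run that terminates correctly differs only in the btl parameter, e.g.:

wassen@elrohir:~/src/mpi_test$ mpirun -np 4 -host pd-01,pd-02,pd-03,pd-04 -mca btl tcp,self sendrecv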
I hope my information suffices to reproduce the problem.

Best regards,
Georg Wassen.

P.S. I know that I could transmit the array in one MPI_Send, but this is extracted from my real problem.
--------------------1st node-----------------------
wassen@pd-01:~$ /opt/infiniband/bin/ibv_devinfo
hca_id: mthca0
        fw_ver:                 1.2.0
        node_guid:              0002:c902:0020:b680
        sys_image_guid:         0002:c902:0020:b683
        vendor_id:              0x02c9
        vendor_part_id:         25204
        hw_ver:                 0xA0
        board_id:               MT_0230000001
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       1
                        port_lmc:       0x00
---------------------------------------------------------
wassen@pd-01:~$ /sbin/ifconfig
...
ib0     Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
        inet addr:192.168.0.11  Bcast:192.168.0.255  Mask:255.255.255.0
        inet6 addr: fe80::202:c902:20:b681/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
        RX packets:260 errors:0 dropped:0 overruns:0 frame:0
        TX packets:331 errors:0 dropped:2 overruns:0 carrier:0
        collisions:0 txqueuelen:128
        RX bytes:14356 (14.0 KiB)  TX bytes:24960 (24.3 KiB)
-------------------------------------------------------
#include "mpi.h" #include <stdio.h> #define NUM 16 int main(int argc, char **argv) { int myrank, count; MPI_Status status; int data[NUM] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16}; int i, j; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); MPI_Comm_size(MPI_COMM_WORLD, &count); if (myrank == 0) { for (i=1; i<count; i++) { for (j=0; j<NUM; j++) { MPI_Send(&data[j], 1, MPI_INT, i, 99, MPI_COMM_WORLD); } } } else { for (j=0; j<NUM; j++) { MPI_Recv(&data[j], 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &status); printf("[%d] received data[%d]=%d\n", myrank, j, data[j]); } } MPI_Finalize(); }
Attachments: config.log.gz, ompi_info_all.txt.gz