FWIW, I'm immediately suspicious of *any* MPI application that uses the MPI
one-sided operations (e.g., MPI_PUT and MPI_GET).  It looks like these two OSU
benchmarks are using those operations.

Is it known that these two benchmarks are correct?
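
Separately, it may help to pin down exactly where each run stalls. Below is a minimal sketch, not a definitive tool: the helper names (parse_rows, last_completed_size, bandwidth_drops) are hypothetical, and it assumes result rows look like the "size  bandwidth" lines in the output quoted below. It reports the size of the last result row printed before the hang, plus any sizes where bandwidth fell relative to the previous size.

```python
import re

# Matches OSU result rows such as "8192                     16.19",
# optionally prefixed by the "> " quoting used in the mail.
ROW = re.compile(r"^\s*(?:>\s*)*(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_rows(text):
    """Return [(message_size_bytes, bandwidth_mb_s), ...] in order of appearance."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2))))
    return rows


def last_completed_size(rows):
    """Size from the last result row, or None if no rows were found."""
    return rows[-1][0] if rows else None


def bandwidth_drops(rows):
    """Sizes whose bandwidth is lower than that of the preceding row."""
    return [size for (_, prev_bw), (size, bw) in zip(rows, rows[1:]) if bw < prev_bw]


# Tail of the osu_put_bibw output quoted in this thread.
SAMPLE = """\
# Size     Bi-Bandwidth (MB/s)
plx2_create_qp line: 415
4096                      8.10
8192                     16.19
16384                     8.46
32768                    20.34
65536                    39.85
131072                   84.22
262144                  142.23
524288                  234.83
mpirun: killing job...
"""

if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    print("last completed size:", last_completed_size(rows))
    print("bandwidth dropped at:", bandwidth_drops(rows))
```

If the benchmark keeps doubling the message size, as the rows suggest, the size after the last completed one is where the run stalled. Saving the mpirun output to a file and passing its contents to parse_rows gives the same summary for each of the hanging tests.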



On Feb 29, 2012, at 11:33 AM, Venkateswara Rao Dokku wrote:

> Sorry, I forgot to introduce the system. Ours is a customized OFED stack
> implemented to work on specific hardware. We tested the stack with
> qperf and the Intel MPI Benchmarks (IMB-3.2.2), and they ran fine. We want
> to run the osu_benchmarks-3.1.1 suite on our OFED stack.
> 
> On Wed, Feb 29, 2012 at 9:57 PM, Venkateswara Rao Dokku <dvrao....@gmail.com> 
> wrote:
> Hi,
> I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. I
> could run 10 of the 14 benchmark tests in the suite. The remaining four
> (osu_put_bibw, osu_put_bw, osu_get_bw, osu_latency_mt) hang at some
> message size. The output is shown below.
> 
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 
> 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 
> /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
> failed to create doorbell file /dev/plx2_char_dev 
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
> 
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
> 
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
> 
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev 
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
> 
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
> 
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
> 
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
> # Size     Bi-Bandwidth (MB/s)
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> 1                         0.00
> 2                         0.00
> 4                         0.01
> 8                         0.03
> 16                        0.07
> 32                        0.15
> 64                        0.11
> 128                       0.21
> 256                       0.43
> 512                       0.88
> 1024                      2.10
> 2048                      4.21
> 4096                      8.10
> 8192                     16.19
> 16384                     8.46
> 32768                    20.34
> 65536                    39.85
> 131072                   84.22
> 262144                  142.23
> 524288                  234.83
> mpirun: killing job...
> 
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7305 on node test2 exited on 
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
> 
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 
> 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 
> /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
> failed to create doorbell file /dev/plx2_char_dev 
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
> 
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
> 
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
> 
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev 
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
> 
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
> 
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
> 
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> plx2_create_qp line: 415 
> 1                         0.02
> 2                         0.05
> 4                         0.10
> 8                         0.19
> 16                        0.39
> 32                        0.77
> 64                        1.53
> 128                       2.57
> 256                       4.16
> 512                       8.30
> 1024                     16.62
> 2048                     33.22
> 4096                     66.51
> 8192                     42.45
> 16384                    11.99
> 32768                    18.20
> 65536                    76.04
> 131072                   98.64
> 262144                  407.66
> 524288                  489.84
> mpirun: killing job...
> 
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7314 on node test2 exited on 
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
> 
> I even checked the logs, but I couldn't see any errors.
> Could you suggest a way to overcome or debug this issue?
> 
> Thanks for the kind reply.
> 
> 
> -- 
> Thanks & Regards,
> D.Venkateswara Rao,
> Software Engineer,One Convergence Devices Pvt Ltd.,
> Jubille Hills,Hyderabad.


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

