Folks,

Thanks for your help and prompt replies. We appreciate all the support we get 
from the community.

S.

-- 
Si Hammond
Scalable Computer Architectures
Sandia National Laboratories, NM, USA
 

On 12/4/18, 6:57 PM, "users on behalf of Gilles Gouaillardet" 
<users-boun...@lists.open-mpi.org on behalf of gil...@rist.or.jp> wrote:

    Thanks for the report.
    
    
    As far as I am concerned, this is a bug in the IMB benchmark, and I 
    issued a PR to fix that
    
    https://github.com/intel/mpi-benchmarks/pull/11
    
    
    Meanwhile, you can manually download and apply the patch at
    
    https://github.com/intel/mpi-benchmarks/pull/11.patch
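    
    For example, from the top of a git clone of the mpi-benchmarks repository,
    something along these lines should work (this is just one way to apply it):
    
        curl -LO https://github.com/intel/mpi-benchmarks/pull/11.patch
        git am 11.patch     # or: patch -p1 < 11.patch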
    
    
    
    Cheers,
    
    
    Gilles
    
    
    On 12/4/2018 4:41 AM, Hammond, Simon David via users wrote:
    > Hi Open MPI Users,
    >
    > Just wanted to report a bug we have seen with Open MPI 3.1.3 and 4.0.0 when
    > using the Intel 2019 Update 1 compilers on our Skylake/OmniPath-1 cluster.
    > The bug occurs when running the GitHub master src_c variant of the Intel
    > MPI Benchmarks.
    >
    > Configuration:
    >
    > ./configure \
    >     --prefix=/home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144 \
    >     --with-slurm --with-psm2 \
    >     CC=/home/projects/x86-64/intel/compilers/2019/compilers_and_libraries_2019.1.144/linux/bin/intel64/icc \
    >     CXX=/home/projects/x86-64/intel/compilers/2019/compilers_and_libraries_2019.1.144/linux/bin/intel64/icpc \
    >     FC=/home/projects/x86-64/intel/compilers/2019/compilers_and_libraries_2019.1.144/linux/bin/intel64/ifort \
    >     --with-zlib=/home/projects/x86-64/zlib/1.2.11 \
    >     --with-valgrind=/home/projects/x86-64/valgrind/3.13.0
    >
    > The operating system is Red Hat 7.4, and we use a local build of GCC 7.2.0
    > to provide the C++ headers for the Intel compilers. Everything builds
    > correctly and passes "make check" without any issues.
    >
    > We then compile IMB and run IMB-MPI1 on 24 nodes and get the following:
    >
    > #----------------------------------------------------------------
    > # Benchmarking Reduce_scatter
    > # #processes = 64
    > # ( 1088 additional processes waiting in MPI_Barrier)
    > #----------------------------------------------------------------
    >         #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
    >              0         1000         0.18         0.19         0.18
    >              4         1000         7.39        10.37         8.68
    >              8         1000         7.84        11.14         9.23
    >             16         1000         8.50        12.37        10.14
    >             32         1000        10.37        14.66        12.15
    >             64         1000        13.76        18.82        16.17
    >            128         1000        21.63        27.61        24.87
    >            256         1000        39.98        47.27        43.96
    >            512         1000        72.93        78.59        75.15
    >           1024         1000       147.21       152.98       149.94
    >           2048         1000       413.41       426.90       420.15
    >           4096         1000       421.28       442.58       434.52
    >           8192         1000       418.31       450.20       438.51
    >          16384         1000      1082.85      1221.44      1140.92
    >          32768         1000      2434.11      2529.90      2476.72
    >          65536          640      5469.57      6048.60      5687.08
    >         131072          320     11702.94     12435.06     12075.07
    >         262144          160     19214.42     20433.83     19883.80
    >         524288           80     49462.22     53896.43     52101.56
    >        1048576           40    119422.53    131922.20    126920.99
    >        2097152           20    256345.97    288185.72    275767.05
    > [node06:351648] *** Process received signal ***
    > [node06:351648] Signal: Segmentation fault (11)
    > [node06:351648] Signal code: Invalid permissions (2)
    > [node06:351648] Failing at address: 0x7fdb6efc4000
    > [node06:351648] [ 0] /lib64/libpthread.so.0(+0xf5e0)[0x7fdb8646c5e0]
    > [node06:351648] [ 1] ./IMB-MPI1(__intel_avx_rep_memcpy+0x140)[0x415380]
    > [node06:351648] [ 2] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libopen-pal.so.40(opal_datatype_copy_content_same_ddt+0xca)[0x7fdb858d847a]
    > [node06:351648] [ 3] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libmpi.so.40(ompi_coll_base_reduce_scatter_intra_ring+0x3f9)[0x7fdb86c43b29]
    > [node06:351648] [ 4] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libmpi.so.40(PMPI_Reduce_scatter+0x1d7)[0x7fdb86c1de67]
    > [node06:351648] [ 5] ./IMB-MPI1[0x40d624]
    > [node06:351648] [ 6] ./IMB-MPI1[0x407d16]
    > [node06:351648] [ 7] ./IMB-MPI1[0x403356]
    > [node06:351648] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdb860bbc05]
    > [node06:351648] [ 9] ./IMB-MPI1[0x402da9]
    > [node06:351648] *** End of error message ***
    > [node06:351649] *** Process received signal ***
    > [node06:351649] Signal: Segmentation fault (11)
    > [node06:351649] Signal code: Invalid permissions (2)
    > [node06:351649] Failing at address: 0x7f9b19c6f000
    > [node06:351649] [ 0] /lib64/libpthread.so.0(+0xf5e0)[0x7f9b311295e0]
    > [node06:351649] [ 1] ./IMB-MPI1(__intel_avx_rep_memcpy+0x140)[0x415380]
    > [node06:351649] [ 2] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libopen-pal.so.40(opal_datatype_copy_content_same_ddt+0xca)[0x7f9b3059547a]
    > [node06:351649] [ 3] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libmpi.so.40(ompi_coll_base_reduce_scatter_intra_ring+0x3f9)[0x7f9b31900b29]
    > [node06:351649] [ 4] /home/projects/x86-64-skylake/openmpi/3.1.3/intel/19.1.144/lib/libmpi.so.40(PMPI_Reduce_scatter+0x1d7)[0x7f9b318dae67]
    > [node06:351649] [ 5] ./IMB-MPI1[0x40d624]
    > [node06:351649] [ 6] ./IMB-MPI1[0x407d16]
    > [node06:351649] [node06:351657] *** Process received signal ***
    > 
    >
    _______________________________________________
    users mailing list
    users@lists.open-mpi.org
    https://lists.open-mpi.org/mailman/listinfo/users
