to Mellanox? I've attached some files with more
detailed information on this problem.
Dave Turner
--
Work: davetur...@ksu.edu (785) 532-7791
118 Nichols Hall, Manhattan KS 66502
Home:drdavetur...@gmail.com
cell: (785) 770-5929
mlx4_error.tar.gz
The Mellanox 2.33.5100 firmware upgrade that came out a few days ago
did indeed fix the problem we were seeing with the mlx4 errors. Thanks
for pointing us in that direction.
Dave Turner
On Thu, Jan 29, 2015 at 11:00 AM, <devel-requ...@open-mpi.org> wrote:
messages.
However, I do think these issues will come up more in the future.
With the low latency of RoCE matching IB, there are more opportunities
to do channel bonding or to use multiple interfaces for aggregate traffic,
even at smaller message sizes.
Dave Turner
--
Work
4
> btl_tcp_bandwidth = 1
>
> make more sense based on your HPC system description.
>
> George.
>
>
>
>
> On Fri, Feb 6, 2015 at 5:37 PM, Dave Turner <drdavetur...@gmail.com>
> wrote:
>
>>
>> We have nodes in our HPC system that have
> should be 327680 and 81920 because of the 8b/10b encoding
> (And that being said, that should not change the measured performance)
>
> Also, could you try again by forcing the same btl_tcp_latency and
> btl_openib_latency?
>
> Cheers,
>
> Gilles
>
> Dave Turn
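The 8b/10b point in the reply above can be checked with a few lines of arithmetic. This is only a sketch: the raw figures 409600 and 102400 are assumptions, chosen because scaling them by 8/10 gives the two numbers quoted (327680 and 81920). The original settings and their units are not shown in this excerpt, and `effective_bandwidth` is a hypothetical helper name.

```python
def effective_bandwidth(signaling_rate):
    """Return the payload rate left after 8b/10b encoding overhead.

    8b/10b line coding sends 10 bits on the wire for every 8 bits of
    payload, so only 8/10 of the signaling rate carries data.
    """
    return signaling_rate * 8 // 10


# Assumed raw values (not taken from the thread).
for raw in (409600, 102400):
    print(raw, "->", effective_bandwidth(raw))
```

Under those assumed inputs the two results are 327680 and 81920, a 4:1 ratio between the two interfaces.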
esirable bias on the load-balance
> between multiple devices logic (the bandwidth part).
>
> I just pushed a fix in master
>
> https://github.com/open-mpi/ompi/commit/e173f9b0c0c63c3ea24b8d8bc0ebafe1f1736acb
> Once validated this should be moved over the 1.8 branch.
>
> Dave
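To make "the bandwidth part" of load balancing between multiple devices concrete, here is a minimal sketch of splitting one message across links in proportion to their configured bandwidths. It is an illustration of the general idea only, not Open MPI's actual code. The function name, the link names, and the 4:1 weights are all assumptions made for the example.

```python
def split_by_bandwidth(message_bytes, bandwidths):
    """Split message_bytes across links in proportion to link bandwidth.

    bandwidths maps a link name to a relative bandwidth weight.
    Returns a dict mapping each link name to the bytes assigned to it.
    """
    total = sum(bandwidths.values())
    shares = {
        name: message_bytes * bw // total
        for name, bw in bandwidths.items()
    }
    # Integer division can leave a few bytes unassigned; give them to
    # the fastest link so the shares always add up to the full message.
    fastest = max(bandwidths, key=bandwidths.get)
    shares[fastest] += message_bytes - sum(shares.values())
    return shares


# Hypothetical weights: one link four times as fast as the other.
print(split_by_bandwidth(1000, {"openib": 4, "tcp": 1}))
```

With weights like these, a bias in either configured value shifts the split directly, which is why the relative bandwidth settings matter more than their absolute size.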
sity, what kind of performance do you get when you use
> MXM? (e.g., the yalla PML on master)
>
>
> > On Feb 19, 2015, at 6:41 PM, Dave Turner <drdavetur...@gmail.com> wrote:
> >
> >
> > I've downloaded the OpenMPI master as suggested and rerun all my
>
mpif.h include file. This looks to be a bug to me, but please let
me know if I missed a config flag somewhere.
Dave Turner
Selene cat bugtest.F
! Program to illustrate bug when OpenMPI is compiled with Intel
! compilers but run using OMPI_FC=gfortran.
PROGRAM BUGTEST
e use a different Fortran
> ! compiler to build Open MPI.
>
> Intel Fortran compilers have the right stuff, so mpif-sizeof.h is usable,
> and you get something very different.
>
> Cheers,
>
> Gilles
>
>
> On 3/4/2016 10:17 AM, Dave Turner wrote:
>
>
>
the compilers, not because of OpenMPI.
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
>
>
>
> On 3 Mar 2016, at 6:39 PM, Dave Turner wrote:
>
> Gilles,
>
> I don't see the point of having the OMPI_CC and OMPI_FC environment
> vari
transparent to our users, and allows us to present a single
> build tree that works for both compilers.
>
>
>
> Cheers,
>
> Ben
>
>
>
>
>
>
>
> *From:* devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Dave
> Turner
> *Sent:* Friday, 4 M
rest of the command line
> args)
>
> and see if it then works?
>
> Howard
>
>
> 2017-01-04 16:37 GMT-07:00 Dave Turner <drdavetur...@gmail.com>:
>
>>
>> --
>> No OpenFabrics connection
024,128,32:S,65536,1024,128,32 (all the rest of the command line
> args)
>
> and see if it then works?
>
> Howard
>
>
> 2017-01-04 16:37 GMT-07:00 Dave Turner <drdavetur...@gmail.com>:
>
>>
>>
that the --nocache measurements represent, I could certainly see large
bioinformatics runs being affected as the message lengths are not
going to be factors of 8 bytes.
Dave Turner
--
Work: davetur...@ksu.edu (785) 532-7791
2219 Engineering Hall, Manhattan KS 66506
l openib,self --mca
> btl_openib_get_limit $((1024*1024)) --mca btl_openib_put_limit
> $((1024*1024)) ./NPmpi --nocache --start 100
>
> George.
>
>
>
> On Wed, May 3, 2017 at 4:27 PM, Dave Turner <drdavetur...@gmail.com>
> wrote:
>
>> George,
>>
ooks pretty odd, and I'll have a look at it.
>
> Which benchmark are you using to measure the bandwidth?
> Does your benchmark call MPI_Init_thread(MPI_THREAD_MULTIPLE)?
> Have you tried without --enable-mpi-thread-multiple?
>
> Cheers,
>
> Gilles
>
> On Wed, Jan
-dlopen with --disable-mca-dso showed good performance.
Replacing --disable-dlopen with --enable-static showed good performance.
So it's only --disable-dlopen that leads to poor performance.
http://netpipe.cs.ksu.edu
Dave Turner
--
Work: davetur...@ksu.edu (785) 532
tests I
can run.
Dave Turner
CentOS 7 on Intel processors, QDR IB and 40 GbE tests
UCX 1.5.0 installed from the tarball according to the docs on the webpage
OpenMPI-4.0.1 configured for verbs with:
./configure F77=ifort FC=ifort \
    --prefix=/homes/daveturner/libs/openmpi-4.0.1
I've rerun my NetPIPE tests using --mca btl ^uct as Yossi suggested,
and that does indeed get rid of the message failures. I don't see any
difference in performance but wanted to check if there is any downside
to doing the build without uct as suggested.
Dave Turner