Hello all,

Open MPI is clever and by default uses multiple IB adapters, if available: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup

Open MPI is also lazy and establishes connections only if they are needed. Both behaviours are good.

We have somewhat special nodes: up to 16 sockets, 128 cores, 4 boards, and 4 IB cards per node. Multirail works!
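Because connections are established lazily, the cost of wire-up lands on the first message between each pair of ranks. A minimal ping-pong that prints every iteration instead of discarding a warm-up phase makes this visible; the following is only a sketch of such a test (not our actual in-house benchmark):

```c
/* Minimal ping-pong latency sketch that reports EVERY iteration,
 * including the very first sample, which standard benchmarks
 * usually discard as "warm-up". Build with mpicc, run with two
 * ranks placed on two different nodes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    char byte = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < 10; i++) {
        double t0 = MPI_Wtime();
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        /* Half the round-trip time is the one-way latency. */
        if (rank == 0)
            printf("iter %d: %.2f us\n", i,
                   (MPI_Wtime() - t0) * 1e6 / 2.0);
    }

    MPI_Finalize();
    return 0;
}
```

With lazy connection establishment, iteration 0 includes the wire-up cost; iterations 1 and onward show the steady-state latency.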
The crucial thing is that, starting with v1.6.1, the very first PingPong sample between two nodes takes a really long time: some 100x to 200x the usual latency. You cannot see this with the usual latency benchmarks(*), because they tend to omit the first samples as a "warm-up phase", but we use a self-written parallel test which clearly shows it (and which made me muse over it for some days). If multirail is forbidden (-mca btl_openib_max_btls 1), if v1.5.3 is used, or if the MPI processes are preconnected (http://www.open-mpi.org/faq/?category=running#mpi-preconnect), there are no such huge latency outliers on the first sample.
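For reference, the two workarounds mentioned above can be passed on the mpirun command line roughly as follows (a sketch; the preconnect parameter name `mpi_preconnect_mpi` is an assumption based on the linked FAQ entry, and `./a.out` is a placeholder application):

```shell
# 1) Forbid multirail: use at most one openib BTL module.
mpirun --mca btl_openib_max_btls 1 ./a.out

# 2) Preconnect all MPI process pairs during MPI_Init,
#    so wire-up cost is paid up front, not on the first message.
mpirun --mca mpi_preconnect_mpi 1 ./a.out
```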
Well, we know about warm-up and lazy connections. But 200x?! Any comments on whether this is OK/expected?

Best,
Paul Kapinos

(*) E.g. HPCC explicitly says in http://icl.cs.utk.edu/hpcc/faq/index.html#132
> Additional startup latencies are masked out by starting the measurement after
> one non-measured ping-pong.

P.S. Sorry for cross-posting to both Users and Developers, but my last questions to Users have received no reply yet, so I am trying to broadcast...
-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915