Re: [OMPI devel] KNEM + user-space hybrid for sm BTL

Aurélien Bouteiller Thu, 18 Jul 2013 11:52:43 -0400

On 18 Jul 2013, at 11:12, "Iliev, Hristo" <[email protected]> wrote:


> Hello,
>  
> Could someone, who is more familiar with the architecture of the sm BTL, 
> comment on the technical feasibility of the following: is it possible to 
> easily extend the BTL (i.e. without having to rewrite it completely from 
> scratch) so as to be able to perform transfers using both KNEM (or other 
> kernel-assisted copying mechanism) for messages over a given size and the 
> normal user-space mechanism for smaller messages with the switch-over point 
> being a user-tunable parameter?
>  
> From what I’ve seen, both implementations have something in common, e.g. both 
> use FIFOs to communicate controlling information.
> The motivation behind this is our effort to become greener by extracting 
> the best possible out-of-the-box performance on our systems without having to 
> profile each and every user application that runs on them. We’ve already 
> determined that activating KNEM really benefits some collective operations on 
> big shared-memory systems, but the increased latency significantly slows down 
> small message transfers, which also hits the pipelined implementations.
>  


Hristo, 

The knem-enabled sm BTL currently available in the trunk does just this :) You 
can use either KNEM or Linux CMA to accelerate interprocess transfers. Use the 
following MCA parameter to turn on knem mode: 

-mca btl_sm_use_knem 1

If my memory serves me well, anything under the eager limit is sent by regular 
double copy: 

-mca btl_sm_eager_limit 4096

This is the default, so anything below one page is copy-in, copy-out. If I 
remember correctly, using knem for anything below 16k decreased performance. 
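
To make the switch-over concrete, here is a rough sketch in Python. It is only 
a toy model of the rule described above, not the actual sm BTL logic. The 
helper names (mpirun_args, transfer_path), the application path ./my_app, the 
process count, and the exact behaviour at the boundary (a message of exactly 
eager-limit bytes) are assumptions made for illustration; only the two MCA 
parameters and the 4096 default come from the text above.

```python
def mpirun_args(app, nprocs=2, use_knem=True, eager_limit=4096):
    """Build an mpirun command line carrying the two MCA parameters above.

    `app` and `nprocs` are placeholders for illustration only.
    """
    return [
        "mpirun", "-np", str(nprocs),
        "-mca", "btl_sm_use_knem", "1" if use_knem else "0",
        "-mca", "btl_sm_eager_limit", str(eager_limit),
        app,
    ]


def transfer_path(msg_bytes, use_knem=True, eager_limit=4096):
    """Toy model of which path a message of msg_bytes would take."""
    # Assumption: messages up to and including the eager limit use the
    # regular double copy; the real boundary handling may differ.
    if not use_knem or msg_bytes <= eager_limit:
        return "copy-in/copy-out"
    return "knem"


if __name__ == "__main__":
    # Print the command line instead of launching it.
    print(" ".join(mpirun_args("./my_app")))
    for size in (1024, 4096, 65536):
        print(size, transfer_path(size))
```

In this model, passing eager_limit=16384 would keep everything up to 16k on 
the double-copy path, which matches my recollection of where knem starts to 
pay off; the right value for a given machine is best confirmed by measurement. 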



We also have a collective component that leverages knem capabilities. If you 
want more details, you can look at the following paper we published at IPDPS 
last year. It covers what we found to be the best cutoff values for using (or 
not using) knem in several collectives. 

Teng Ma, George Bosilca, Aurelien Bouteiller, Jack Dongarra, "HierKNEM: An 
Adaptive Framework for Kernel-Assisted and Topology-Aware Collective 
Communications on Many-core Clusters," 2012 IEEE 26th International Parallel 
and Distributed Processing Symposium (IPDPS), pp. 970-982, 2012 

http://www.computer.org/csdl/proceedings/ipdps/2012/4675/00/4675a970-abs.html


Enjoy, 
Aurelien 



> sm’s code doesn’t seem to be very complex but still I’ve decided to ask first 
> before diving any deeper.
>  
> Kind regards,
> Hristo
> --
> Hristo Iliev, PhD – High Performance Computing Team
> RWTH Aachen University, Center for Computing and Communication
> Rechen- und Kommunikationszentrum der RWTH Aachen
> Seffenter Weg 23, D 52074 Aachen (Germany)
>  
>  
> _______________________________________________
> devel mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
* Dr. Aurélien Bouteiller
* Researcher at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 309b
* Knoxville, TN 37996
* 865 974 9375
