I was making basic performance measurements on our machine after installing 
Open MPI 1.8.5, and the performance looked bad. It turns out that the smcuda 
BTL has a higher exclusivity than both vader and sm, even on machines with no 
NVIDIA adapters. Is there a strong reason why the default exclusivity is set so 
high? Of course, it can easily be fixed with a couple of MCA options, but 
unsuspecting users who “just run” will experience roughly 1/3 overhead across 
the board for shared-memory communication, according to my measurements.
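
For reference, a minimal sketch of the kind of workaround I mean. It relies on 
the standard OMPI_MCA_<param> environment-variable form of MCA parameters; 
./a.out is only a placeholder application, and the exclusivity parameter name 
in the comments is an assumption that should be checked with ompi_info on your 
build.

```shell
# Exclude the smcuda BTL for every job launched from this shell; the leading
# "^" means "all available components except the ones listed".
export OMPI_MCA_btl='^smcuda'

# Equivalent per-run form (./a.out is a placeholder application):
#   mpirun --mca btl ^smcuda -np 2 ./a.out
#
# Alternative, assuming the usual per-BTL exclusivity parameter is exposed
# for smcuda (verify with: ompi_info --param btl smcuda --level 9):
#   mpirun --mca btl_smcuda_exclusivity 0 -np 2 ./a.out
```

Either way, the point is that users should not need to know about this to get 
reasonable shared-memory performance on a machine without a GPU.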


Side note: from my understanding of the smcuda component, performance should be 
identical to the regular sm component (as long as no GPU operations are 
required). This is not the case: there is some performance penalty with smcuda 
compared to sm.

Aurelien

--
Aurélien Bouteiller ~~ https://icl.cs.utk.edu/~bouteill/
