Hello,

we have run a benchmark (SkaMPI) on a single machine (32-core Intel Xeon, x86_64) in these two configurations:

- 1 orted with 16 processes, with the BTL forced to TCP (--mca btl self,tcp)
- 16 orteds with 1 process each (which also use TCP)

We use a custom RAS to allow multiple orteds on the same machine (I know it seems nonsensical to run multiple orteds on the same machine for the same application, but we are doing some experiments on migration).

We initially expected approximately the same performance in both cases, since 16 processes communicate via TCP in both, but we see a degradation of 50%, and we are sure it is not overhead from orted initialization.

Do you have any idea how multiple orteds can influence the processes' performance?

Cheers,
Federico

__
Federico Reghenzani
M.Eng. Student @ Politecnico di Milano
Computer Science and Engineering