On 4/6/2010 2:54 PM, Jeff Squyres wrote:
> Sorry for the delay -- I just replied on the user list -- I think the first
> thing to do is to establish baseline networking performance and see if that
> is out of whack. If the underlying network is bad, then MPI performance will
> also be bad.
>
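Jeff's suggestion to baseline the network first can be sketched as a quick TCP round-trip probe. This is only a minimal illustration over loopback (real baselining should use a tool like NetPIPE or netperf over the actual interconnect); all names in it are illustrative:

```python
# Minimal localhost TCP round-trip latency probe (a sketch, not a benchmark tool).
import socket
import threading
import time

def echo_server(srv):
    # Accept one connection and echo everything back until the peer closes.
    conn, _ = srv.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))   # let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

cli = socket.socket()
cli.connect(("127.0.0.1", port))
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # disable Nagle

n = 200
payload = b"x" * 64
t0 = time.perf_counter()
for _ in range(n):
    cli.sendall(payload)
    buf = b""
    while len(buf) < len(payload):
        buf += cli.recv(len(payload) - len(buf))
t1 = time.perf_counter()
cli.close()

rtt_us = (t1 - t0) / n * 1e6
print(f"mean 64-byte round trip over loopback: {rtt_us:.1f} us")
```

A kernel-version regression in the NIC driver would show up as a jump in this kind of number between nodes, while loopback figures should stay roughly constant.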
Could make sense. With kernel 2.6.24, it seems a major change to the modules
for Intel PCI-Express network cards was introduced. Does Open MPI use TCP
communication even if all processes are on the same local node?

> On Apr 6, 2010, at 11:51 AM, Oliver Geisler wrote:
>
>> On 4/6/2010 10:11 AM, Rainer Keller wrote:
>>> Hello Oliver,
>>> Hmm, this is really a teaser...
>>> I haven't seen such drastic behavior, and haven't read of any on the list.
>>>
>>> One thing, however, that might interfere is process binding.
>>> Could you make sure that processes are not bound to cores (the default in
>>> 1.4.1):
>>> with mpirun --bind-to-none
>>>
>>
>> I have tried version 1.4.1, using the default settings, and watched
>> processes switch from core to core in "top" (with "f" + "j"). Then I tried
>> --bind-to-core and explicitly --bind-to-none, all with the same result:
>> ~20% CPU wait and much longer overall computation times.
>>
>> Thanks for the idea ...
>> Every input is helpful.
>>
>> Oli
>>
>>> Just an idea...
>>>
>>> Regards,
>>> Rainer
>>>
>>> On Tuesday, 06 April 2010 10:07:35 am, Oliver Geisler wrote:
>>>> Hello Devel List,
>>>>
>>>> I am a little bit helpless about this matter. I already posted to the
>>>> users list; in case you don't read it, I am posting here as well.
>>>>
>>>> This is the original posting:
>>>>
>>>> http://www.open-mpi.org/community/lists/users/2010/03/12474.php
>>>>
>>>> Short version:
>>>> Switching from kernel 2.6.23 to 2.6.24 (and up), using Open MPI 1.2.7-rc2
>>>> (outdated, I know, but that is what Debian stable ships; the results are
>>>> the same with 1.4.1), increases communication times between processes
>>>> (essentially between one master and several slave processes). This
>>>> happens regardless of whether the processes are local only or
>>>> communicate over Ethernet.
>>>>
>>>> Did anybody witness such behavior?
>>>>
>>>> Any ideas what I should test for?
>>>>
>>>> What additional information should I provide?
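The TCP question above can be answered empirically: by default Open MPI of that era picks the shared-memory (sm) BTL for on-node peers rather than TCP, but the transport can be pinned down explicitly with MCA parameters to compare the two cases. A sketch (`./your_app` and the process count are placeholders):

```shell
# On-node traffic over shared memory only, no TCP BTL loaded:
mpirun --mca btl self,sm -np 4 ./your_app

# Force TCP even between processes on the same node, for comparison:
mpirun --mca btl self,tcp -np 4 ./your_app
```

If the slowdown appears only in the second form, the kernel's network stack or NIC driver change is the likely culprit; if it appears in both, the problem lies elsewhere (e.g. scheduling or binding).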
>>>>
>>>> Thanks for your time
>>>>
>>>> oli
>>>>
>>>
>>
>>
>> --
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>