On Monday 17 May 2010, Scott Atchley wrote:
> On May 16, 2010, at 1:32 PM, Lydia Heck wrote:
> > When running over gigabit using -mca btl tcp,self,sm the code runs
> > alright, which is good as the largest part of our cluster is over
> > gigabit, and as Gadget-3 scales rather well, the penalty for running over
> > gigabit is not prohibitive. We also have a myrinet cluster and on there
> > larger runs freeze. However as the gigabit cluster was available we have
> > not really investigated this until just now.
>
> Hi Lydia,
>
> I can't help with the IB issue, but I am interested in the issue running
> over MX.
>
> I found a ticket from 2007 regarding Gadget-2. The last set of mails
> indicated that the app was running. You have had a few tickets since, but
> none mentioned Gadget. Can you give me more details about the hang that
> you experienced?
>
> I have a couple of ideas that we could investigate (one in Open-MPI and the
> other in MX).
>
> Scott

Hello,
To add a bit more noise: I gave up on Gadget2 with openmpi-gm. It always froze, sometimes only after days of integration, and I was not able to find a clear trend. It is now working well with mpich-gm (thanks to the nice Myricom folks), which means that at least I don't have a hardware problem.

Regards

-- 
Jaime D. Perea Duarte. <jaime at iaa dot es>
Linux registered user #10472
Dep. Astrofisica Extragalactica.
Instituto de Astrofisica de Andalucia (CSIC)
Apdo. 3004, 18080 Granada, Spain.