This definitely looked promising, but unfortunately it didn't work. I first added the appropriate export lines to my qsub file, and when that didn't help I checked the mvapich.conf file and confirmed that processor affinity was already disabled there. I wonder whether turning it on there would make a difference, but unfortunately the cluster is full at the moment, so I can't test it.
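For reference, this is roughly what the relevant part of my qsub script looks like now (which didn't fix it for me). The PE name, hostfile path, and binary name are placeholders for my actual job, and I'm assuming an mpirun_rsh-style launch; adjust if you use a different launcher:

    #!/bin/bash
    #$ -pe mvapich 12        # placeholder PE name; ours uses $fill_up
    #$ -cwd

    # Disable MVAPICH processor affinity so two jobs sharing a node
    # don't both get pinned to CPUs #0-#3.
    export VIADEV_USE_AFFINITY=0
    export VIADEV_ENABLE_AFFINITY=0

    # As far as I know, mpirun_rsh only forwards variables that are
    # named on its command line, so I also pass them there explicitly.
    mpirun_rsh -np $NSLOTS -hostfile $TMPDIR/machines \
        VIADEV_USE_AFFINITY=0 VIADEV_ENABLE_AFFINITY=0 ./my_mpi_app

If anyone knows whether the exports alone should be enough with mpirun_rsh, I'd appreciate confirmation, since as noted above they didn't seem to take effect for me.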
-- KS

-----Original Message-----
From: Shannon V. Davidson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 23, 2008 4:02 PM
To: Schoenefeld, Keith
Cc: [email protected]
Subject: Re: [Beowulf] Strange SGE scheduling problem

Schoenefeld, Keith wrote:
> My cluster has 8 slots (cores)/node in the form of two quad-core
> processors. Only recently we've started running jobs on it that require
> 12 slots. We've noticed significant speed problems running multiple 12
> slot jobs, and quickly discovered that a node running 4 slots of one job
> and 4 slots of another job was running both jobs on the same processor
> cores (i.e. both job1 and job2 were running on CPUs #0-#3, while CPUs
> #4-#7 were left idling). The result is that the jobs were competing for
> time on half the processors that were available.
>
> In addition, a 4 slot job started well after the 12 slot job has ramped
> up results in the same problem (both the 12 slot job and the four slot
> job get assigned to the same slots on a given node).
>
> Any insight as to what is occurring here and how I could prevent it from
> happening? We are using SGE + mvapich 1.0 and a PE that has the
> $fill_up allocation rule.
>
> I have also posted this question to the [EMAIL PROTECTED]
> mailing list, so my apologies to people who get this email multiple
> times.

> Any insight as to what is occurring here and how I could prevent it from
> happening? We are using SGE + mvapich 1.0 and a PE that has the
> $fill_up allocation rule.

This sounds like MVAPICH is assigning your MPI tasks to your CPUs
starting with CPU #0. If you are going to run multiple MVAPICH jobs on
the same host, turn off CPU affinity by starting the MPI tasks with the
environment variables VIADEV_USE_AFFINITY=0 and VIADEV_ENABLE_AFFINITY=0.

Cheers,
Shannon

> Any help is appreciated.
>
> -- KS
>
> _______________________________________________
> Beowulf mailing list, [email protected]
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

--
_________________________________________
Shannon V. Davidson <[EMAIL PROTECTED]>
Software Engineer          Appro International
636-633-0380 (office)      443-383-0331 (fax)
_________________________________________

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
