Gilbert Grosdidier wrote:
Any other suggestion?
Can any more information be extracted from profiling? Here is where I
think things left off:
Eugene Loh wrote:
Gilbert Grosdidier wrote:
# [time] [calls] <%mpi> <%wall>
# MPI_Waitall [...]
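One way to get more out of that profile without a full tracing tool (a
minimal sketch, not something posted in the thread; the request array
and count stand in for the application's own) is to bracket the
MPI_Waitall calls with MPI_Wtime and compare the totals across ranks:

  #include <mpi.h>
  #include <stdio.h>

  static double waitall_time = 0.0;

  /* drop-in wrapper: accumulate wall time spent waiting */
  static void timed_waitall(int nreq, MPI_Request *reqs)
  {
      double t0 = MPI_Wtime();
      MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
      waitall_time += MPI_Wtime() - t0;
  }

  /* call once just before MPI_Finalize */
  static void report_waitall(void)
  {
      double tmax, tmin;
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Reduce(&waitall_time, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
      MPI_Reduce(&waitall_time, &tmin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
      if (rank == 0)
          printf("MPI_Waitall: min %.2f s, max %.2f s across ranks\n",
                 tmin, tmax);
  }

A large max/min spread would point at load imbalance between ranks
rather than at slow communication itself.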
Unfortunately, I was unable to spot any striking difference in
performance when using --bind-to-core. Sorry. Any other suggestion?
Regards, Gilbert.
On 7 Jan 2011 at 16:32, Jeff Squyres wrote:
Well, bummer -- there goes my theory. According to the hwloc info
you posted earlier, this shows that OMPI is binding to the 1st
hyperthread on each core; *not* to both hyperthreads on a single core.
I'll very soon try using hyperthreading with our app,
and keep you posted about the improvements, if any.
Our current cluster is made up of dual-socket 4-core Nehalem nodes.
Cheers, Gilbert.
On 7 Jan 2011 at 16:17, Tim Prince wrote:
On 1/7/2011 6:49 AM, Jeff Squyres wrote:
Well, bummer -- there goes my theory. According to the hwloc info you posted
earlier, this shows that OMPI is binding to the 1st hyperthread on each core;
*not* to both hyperthreads on a single core. :-\
It would still be slightly interesting to see if there's any difference when
you run with --bind-to-core.
On 1/7/2011 6:49 AM, Jeff Squyres wrote:
My understanding is that hyperthreading can only be activated/deactivated at
boot time -- once the core resources are allocated to hyperthreads, they can't
be changed while running.
Whether disabling the hyperthreads or simply telling Linux not to schedule
on them [...]
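For what it's worth, the "telling Linux not to schedule on them" option
does not need a reboot: on Linux a logical CPU can be taken out of the
scheduler's view through the standard sysfs hotplug file. A hedged
sketch, not from the thread; it needs root, and it assumes cpu8 is the
sibling of cpu0, as in the lstopo output further down where P#0 and
P#8 share Core L#0:

  #include <stdio.h>

  /* take logical CPU 8 offline; write "1" instead to bring it back */
  int main(void)
  {
      FILE *f = fopen("/sys/devices/system/cpu/cpu8/online", "w");
      if (!f) { perror("fopen"); return 1; }
      fputs("0\n", f);
      return fclose(f) == 0 ? 0 : 1;
  }

Repeating this for cpu9 through cpu15 would leave only the first
hyperthread of each core schedulable.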
Yes, here it is:
> mpirun -np 8 --mca mpi_paffinity_alone 1 /opt/software/SGI/hwloc/1.1rc6r3028/bin/hwloc-bind --get
0x0001
0x0002
0x0004
0x0008
0x0010
0x0020
0x0040
0x0080
Gilbert.
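Each mask above is a cpuset with exactly one bit set, i.e. rank k is
bound to PU P#k. Given the PU numbering in the lstopo output further
down (P#0-P#7 are the first hyperthreads, P#8-P#15 the second), 0x0001
through 0x0080 cover the first hyperthread of each of the 8 cores. A
throwaway decoder for such masks, offered purely as an illustration:

  #include <stdio.h>
  #include <stdlib.h>

  /* print the PU indices covered by a hex cpuset mask,
     e.g. "./decode 0x0040" prints "0x0040 -> PU#6" */
  int main(int argc, char **argv)
  {
      if (argc < 2) { fprintf(stderr, "usage: %s 0xMASK\n", argv[0]); return 1; }
      unsigned long mask = strtoul(argv[1], NULL, 16);
      printf("%s ->", argv[1]);
      for (int pu = 0; mask != 0; pu++, mask >>= 1)
          if (mask & 1UL)
              printf(" PU#%d", pu);
      printf("\n");
      return 0;
  }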
On 7 Jan 2011 at 15:50, Jeff Squyres wrote:
Can you run with np=8?
On Jan 7, 2011, at 9:49 AM, Gilbert Grosdidier wrote:
> Hi Jeff,
>
> Thanks for taking care of this.
>
> Here is what I got on a worker node:
>
> > mpirun --mca mpi_paffinity_alone 1 /opt/software/SGI/hwloc/1.1rc6r3028/bin/hwloc-bind --get
> 0x0001
>
> Is this what is expected, please? Or should I try yet another command?
On Jan 7, 2011, at 5:27 AM, John Hearns wrote:
> Actually, the topic of hyperthreading is interesting, and we should
> discuss it please.
> Hyperthreading is supposedly implemented better and 'properly' on
> Nehalem - I would be interested to see some genuine
> performance measurements with hyperthreading [...]
Hi Jeff,
Thanks for taking care of this.
Here is what I got on a worker node:
> mpirun --mca mpi_paffinity_alone 1 /opt/software/SGI/hwloc/1.1rc6r3028/bin/hwloc-bind --get
0x0001
Is this what is expected, please? Or should I try yet another command?
Thanks, Regards, Gilbert
On 6 January 2011 21:10, Gilbert Grosdidier wrote:
> Hi Jeff,
>
> Where is the lstopo command located on SuseLinux, please?
> And/or hwloc-bind, which seems related to it?
I was able to get hwloc to install quite easily on SuSE -
download/configure/make.
Configure it to install to /usr/local/bin. [...]
Hi Jeff,
Here is the output of lstopo on one of the workers (thanks Jean-Christophe):
> lstopo
Machine (35GB)
  NUMANode L#0 (P#0 18GB) + Socket L#0 + L3 L#0 (8192KB)
    L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#8)
    L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
      PU L#2 (P#1)
      [...]
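The check lstopo is doing here can also be made programmatically; a
small sketch against hwloc's C API (assuming hwloc 1.x, the series used
elsewhere in this thread) that reports whether PUs outnumber cores:

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;
      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);
      /* more PUs than cores means hyperthreading is enabled */
      int cores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
      int pus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);
      printf("%d cores, %d PUs -> hyperthreading %s\n",
             cores, pus, pus > cores ? "enabled" : "disabled");
      hwloc_topology_destroy(topo);
      return 0;
  }

On the node above this would print "8 cores, 16 PUs -> hyperthreading
enabled".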
Yes Jeff, I'm pretty sure indeed that hyperthreading is enabled, since
16 CPUs are visible in the /proc/cpuinfo pseudo-file, while it's an
8-core Nehalem node.
However, I always carefully checked that only 8 processes are running
on each node.
Could it be that they are assigned to 8 hyperthreads [...]
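One direct way to answer that question (a sketch, not from the thread;
sched_getcpu() is glibc/Linux-specific) is to have every rank report
the logical CPU it is executing on. With this node's numbering, eight
distinct values in 0-7 mean one rank per core; a pair like 3 and 11
would mean two ranks sharing a core:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      /* which logical CPU (PU) is this rank on right now? */
      printf("rank %d on CPU %d\n", rank, sched_getcpu());
      MPI_Finalize();
      return 0;
  }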
On Jan 6, 2011, at 4:10 PM, Gilbert Grosdidier wrote:
> Where is the lstopo command located on SuseLinux, please?
'fraid I don't know anything about Suse... :-(
It may be named hwloc-ls...?
> And/or hwloc-bind, which seems related to it?
hwloc-bind is definitely related, but it's a different utility [...]
Hi Jeff,
Where is the lstopo command located on SuseLinux, please?
And/or hwloc-bind, which seems related to it?
Thanks, G.
On 06/01/2011 21:21, Jeff Squyres wrote:
(now that we're back from vacation)
Actually, this could be an issue. Is hyperthreading enabled on your machine?
Can you send the text output from running hwloc's "lstopo" command on your
compute nodes?
(now that we're back from vacation)
Actually, this could be an issue. Is hyperthreading enabled on your machine?
Can you send the text output from running hwloc's "lstopo" command on your
compute nodes?
I ask because if hyperthreading is enabled, OMPI might be assigning one process
per *hyperthread* rather than one per *core* [...]
Hi David,
Yes, I set mpi_paffinity_alone to 1. Is that right and sufficient, please?
Thanks for your help, Best, G.
On 22/12/2010 20:18, David Singleton wrote:
Is the same level of process and memory affinity or binding being used?
On 12/21/2010 07:45 AM, Gilbert Grosdidier wrote:
Is the same level of process and memory affinity or binding being used?
On 12/21/2010 07:45 AM, Gilbert Grosdidier wrote:
Yes, there is definitely only 1 process per core with both MPI implementations.
Thanks, G.
On 20/12/2010 20:39, George Bosilca wrote:
Are your processes placed the same way with the two MPI implementations?
Gilbert Grosdidier wrote:
Good evening, Eugene,
Good morning where I am.
Here follows some output for a 1024 core run.
Assuming this corresponds meaningfully with your original e-mail, 1024
cores means performance of 700 vs 900. So, that looks roughly
consistent with the 28% MPI time you show here: losing 28% of the wall
time to MPI would scale 900 down to about 0.72 * 900 ≈ 650, in the
same ballpark as 700.
Good evening, Eugene,
First, thanks for trying to help me.
I already gave a try to a profiling tool, namely IPM, which is rather
simple to use. Here follows some output for a 1024 core run.
Unfortunately, I'm as yet unable to produce the equivalent MPT chart.
#IPMv0.983#
Can you isolate a bit more where the time is being spent? The
performance effect you're describing appears to be drastic. Have you
profiled the code? Some choices of tools can be found in the FAQ
http://www.open-mpi.org/faq/?category=perftools The results may be
"uninteresting" (all time sp
There is indeed a high rate of communication. But the buffer
size is always the same for a given pair of processes, and I thought
that mpi_leave_pinned should avoid freeing the memory in this case.
Am I wrong?
Thanks, Best, G.
On 21/12/2010 18:52, Matthieu Brucher wrote:
Don't forget that MPT has some optimizations OpenMPI may not have, such as
"overriding" free(). This way, MPT can have a huge performance boost
if you're allocating and freeing memory, and the same happens if you
communicate often.
Matthieu
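Matthieu's point in code form, as a hedged sketch rather than anything
from the thread (buffer size, step count and the even/odd rank pairing
are invented): with registration caching such as mpi_leave_pinned, a
buffer allocated once is pinned once and then reused, while a
malloc/free per iteration forces the memory to be re-registered every
time, which is exactly where an MPI that intercepts free() can win:

  #include <mpi.h>
  #include <stdlib.h>

  enum { N = 1 << 20, STEPS = 100 };

  /* stand-in for the application's repeated exchange with a fixed peer */
  static void exchange(double *buf, int peer)
  {
      MPI_Sendrecv_replace(buf, N, MPI_DOUBLE, peer, 0, peer, 0,
                           MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  }

  int main(int argc, char **argv)
  {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      int peer = rank ^ 1;            /* assumes an even number of ranks */

      /* cache-friendly: one buffer, registered once, reused every step */
      double *buf = calloc(N, sizeof *buf);
      for (int s = 0; s < STEPS; s++)
          exchange(buf, peer);
      free(buf);

      /* cache-hostile: a fresh buffer each step must be pinned anew */
      for (int s = 0; s < STEPS; s++) {
          double *tmp = calloc(N, sizeof *tmp);
          exchange(tmp, peer);
          free(tmp);
      }

      MPI_Finalize();
      return 0;
  }

If the buffers really are reused as described above, mpi_leave_pinned
should already be hitting the friendly case.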
2010/12/21 Gilbert Grosdidier :
Hi George,
Thanks for your help. The bottom line is that the processes are
neatly placed on the nodes/cores,
as far as I can tell from the map:
[...]
Process OMPI jobid: [33285,1] Process rank: 4
Process OMPI jobid: [33285,1] Process rank: 5
Process OMPI jobid: [33285,1] [...]
That's a first step. My question was more related to the process overlay on the
cores. If the MPI implementation places one process per node, then rank k and
rank k+1 will always be on separate nodes, and the communications will have to
go over IB. In the opposite case, if the MPI implementation places consecutive
ranks on the same node, they can communicate through shared memory [...]
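A quick way to see which of the two layouts a run actually gets (an
illustrative snippet, not something posted in the thread) is to print
the rank-to-host mapping and check whether consecutive ranks report
the same host:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, len;
      char host[MPI_MAX_PROCESSOR_NAME];
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(host, &len);
      /* per-core placement: ranks k and k+1 share a host;
         per-node placement: they land on different hosts */
      printf("rank %d on %s\n", rank, host);
      MPI_Finalize();
      return 0;
  }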
Yes, there is definitely only 1 process per core with both MPI
implementations.
Thanks, G.
On 20/12/2010 20:39, George Bosilca wrote:
Are your processes placed the same way with the two MPI implementations?
Per-node vs. per-core?
george.
On Dec 20, 2010, at 11:14, Gilbert Grosdidier wrote:
Are your processes placed the same way with the two MPI implementations?
Per-node vs. per-core?
george.
On Dec 20, 2010, at 11:14 , Gilbert Grosdidier wrote:
> Hello,
>
> I am now at a loss with my running of OpenMPI (namely 1.4.3)
> on a SGI Altix cluster with 2048 or 4096 cores, running over Infiniband.
Hello,
I am now at a loss with my running of OpenMPI (namely 1.4.3)
on a SGI Altix cluster with 2048 or 4096 cores, running over Infiniband.
After fixing several rather obvious failures with Ralph's, Jeff's and
John's help,
I am now facing the bottom of this story, since:
- there are no more obvious [...]