Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-06 Thread Brock Palen
Chris,

Assuming your cpusets are correct, and you are not doing any hybrid 
threads+MPI, I found the problem is avoided if you enable -bind-to-core 
with Open MPI 1.6.x.
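
For example, something along these lines (just a sketch; -report-bindings
is optional and only makes mpirun print where each rank lands):

  mpirun -np 12 -bind-to-core -report-bindings ./stream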

We just don't enable binding by default on our setup, and thus far no users 
have been bitten by this. 

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro...@umich.edu
(734)936-1985



On Nov 5, 2012, at 9:00 PM, Christopher Samuel wrote:

> On 06/11/12 08:57, Brock Palen wrote:
> 
>> Ok, more information (I had to build a newer hwloc). In my job today,
>> only 2 processes are running at half speed, and they are indeed sharing
>> the same core:
> 
> We've seen the same occasionally using CentOS5/RHEL5 with jobs running
> under Torque with cpusets enabled.
> 
> We've never been able to explain it. The most recent case was someone
> using a home-compiled version of NAMD, and the problem disappeared when
> they switched to our provided builds.
> 
> I was fixing up the problem jobs by hand, assigning processes to
> individual cores on the nodes with cpusets.  :-/
> 
> cheers,
> Chris
> -- 
> Christopher Samuel        Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/          http://twitter.com/vlsci




Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-05 Thread Christopher Samuel

On 06/11/12 08:57, Brock Palen wrote:

> Ok, more information (I had to build a newer hwloc). In my job today,
> only 2 processes are running at half speed, and they are indeed sharing
> the same core:

We've seen the same occasionally using CentOS5/RHEL5 with jobs running
under Torque with cpusets enabled.

We've never been able to explain it. The most recent case was someone
using a home-compiled version of NAMD, and the problem disappeared when
they switched to our provided builds.

I was fixing up the problem jobs by hand, assigning processes to
individual cores on the nodes with cpusets.  :-/

cheers,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/          http://twitter.com/vlsci



Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-05 Thread Samuel Thibault
Brice Goglin, on Mon 05 Nov 2012 23:23:42 +0100, wrote:
> top can also sort by the last used CPU. Type f to enter the config menu,
> highlight the "last cpu" line, and hit 's' to make it the sort column.

With older versions of top, type F, then j, then space.

Samuel


Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-05 Thread Brice Goglin
On 05/11/2012 22:57, Brock Palen wrote:
> Ok, more information (I had to build a newer hwloc). In my job today, only 2
> processes are running at half speed, and they are indeed sharing the same core:
>
> [root@nyx7000 ~]# for x in `cat /tmp/pids `; do echo -n "$x  "; hwloc-bind 
> --get-last-cpu-location --pid $x; done | sort -k 2
> 1164  0x0001,0x0
> 1158  0x0010,0x0
> 1165  0x0010,0x0
> 1167  0x0020
> 1157  0x0200
> 1159  0x0400
> 1160  0x2000
> 1163  0x4000
> 1166  0x0002
> 1161  0x0004
> 1168  0x0020
> 1162  0x0040
>
>  1157 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
>  1159 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   8:10.91 stream
>  1161 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
>  1162 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.54 stream
>  1163 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
>  1164 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.53 stream
>  1160 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.54 stream
>  1166 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.53 stream
>  1167 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.46 stream
>  1168 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   8:10.86 stream
>  1158 brockp    20   0 1885m 1.8g  456 R 48.9  0.2   4:54.78 stream
>  1165 brockp    20   0 1885m 1.8g  456 R 48.9  0.2   4:54.76 stream
>
>
> This is very strange. Is there a way to ask hwloc to show me all processes 
> that are using a given cpu?
>

No, there's no easy way to do that.
You should first check whether the given CPU is idle or not. Running
top and pressing 1 will show one line per CPU (yours should be the
second CPU line).

top can also sort by the last used CPU. Type f to enter the config menu,
highlight the "last cpu" line, and hit 's' to make it the sort column.
Assuming your top version isn't too different from mine, you should be
able to quickly see if any process used the given CPU recently.
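
If you want to script the check, something along these lines should work
(a rough sketch: it reuses your /tmp/pids list and the mask reported for
the suspect core, 0x0010,0x0 above):

  mask="0x0010,0x0"
  for pid in `cat /tmp/pids`; do
      loc=`hwloc-bind --get-last-cpu-location --pid $pid`
      # print only the processes whose last CPU location matches the mask
      [ "$loc" = "$mask" ] && echo "$pid last ran in $loc"
  done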

Brice



Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-05 Thread Brock Palen
Ok, more information (I had to build a newer hwloc). In my job today, only 2
processes are running at half speed, and they are indeed sharing the same core:

[root@nyx7000 ~]# for x in `cat /tmp/pids `; do echo -n "$x  "; hwloc-bind 
--get-last-cpu-location --pid $x; done | sort -k 2
1164  0x0001,0x0
1158  0x0010,0x0
1165  0x0010,0x0
1167  0x0020
1157  0x0200
1159  0x0400
1160  0x2000
1163  0x4000
1166  0x0002
1161  0x0004
1168  0x0020
1162  0x0040

 1157 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
 1159 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   8:10.91 stream
 1161 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
 1162 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.54 stream
 1163 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.55 stream
 1164 brockp    20   0 1885m 1.8g  456 R 99.6  0.2   9:49.53 stream
 1160 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.54 stream
 1166 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.53 stream
 1167 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   9:49.46 stream
 1168 brockp    20   0 1885m 1.8g  456 R 97.7  0.2   8:10.86 stream
 1158 brockp    20   0 1885m 1.8g  456 R 48.9  0.2   4:54.78 stream
 1165 brockp    20   0 1885m 1.8g  456 R 48.9  0.2   4:54.76 stream


This is very strange. Is there a way to ask hwloc to show me all processes that 
are using a given cpu?




Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro...@umich.edu
(734)936-1985



On Nov 2, 2012, at 4:30 PM, Brice Goglin wrote:

> On 02/11/2012 21:22, Brice Goglin wrote:
>> hwloc-bind --get-last-cpu-location --pid <pid> should give the same
>> info but it seems broken on my machine right now, going to debug.
> 
> Actually, that works fine once you try it on a non-multithreaded program
> that uses all cores :)
> 
> So you can use top or hwloc-bind --get-last-cpu-location --pid <pid> to
> find out where each process runs.
> 
> Brice
> 




Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-02 Thread Brice Goglin
On 02/11/2012 21:22, Brice Goglin wrote:
> hwloc-bind --get-last-cpu-location --pid <pid> should give the same
> info but it seems broken on my machine right now, going to debug.

Actually, that works fine once you try it on a non-multithreaded program
that uses all cores :)

So you can use top or hwloc-bind --get-last-cpu-location --pid <pid> to
find out where each process runs.
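
For example, with one of the pids from your earlier top output (just an
illustration of the difference between the two queries):

  # where the process is allowed to run (its binding)...
  hwloc-bind --get --pid 45523
  # ...versus where it actually ran last
  hwloc-bind --get-last-cpu-location --pid 45523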

Brice



Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-02 Thread Brice Goglin
On 02/11/2012 21:03, Brock Palen wrote:
> This isn't a hwloc problem exactly, but maybe you can shed some light on it.
>
> We have some 4-socket, 10-core = 40-core nodes, HT off:
>
> depth 0:         1 Machine (type #1)
>  depth 1:        4 NUMANodes (type #2)
>   depth 2:       4 Sockets (type #3)
>    depth 3:      4 Caches (type #4)
>     depth 4:    40 Caches (type #4)
>      depth 5:   40 Caches (type #4)
>       depth 6:  40 Cores (type #5)
>        depth 7: 40 PUs (type #6)
>
>
> We run RHEL 6.3 and use Torque to create cgroups for jobs. I get the
> following cgroup for this job; all 12 cores for the job are on one node:
> cat /dev/cpuset/torque/8845236.nyx.engin.umich.edu/cpus 
> 0-1,4-5,8,12,16,20,24,28,32,36
>
> Not all nicely spaced, but 12 cores
>
> I then start a code, even a simple serial code, with Open MPI 1.6.0 on all 12
> cores:
> mpirun ./stream
>
> 45521 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   4:02.72 stream
> 45522 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   1:46.08 stream
> 45525 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   4:02.72 stream
> 45526 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   1:46.07 stream
> 45527 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   4:02.71 stream
> 45528 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   4:02.71 stream
> 45532 brockp    20   0 1885m 1.8g  456 R 100.0  0.2   1:46.05 stream
> 45529 brockp    20   0 1885m 1.8g  456 R  99.2  0.2   4:02.70 stream
> 45530 brockp    20   0 1885m 1.8g  456 R  99.2  0.2   4:02.70 stream
> 45531 brockp    20   0 1885m 1.8g  456 R  33.6  0.2   1:20.89 stream
> 45523 brockp    20   0 1885m 1.8g  456 R  32.8  0.2   1:20.90 stream
> 45524 brockp    20   0 1885m 1.8g  456 R  32.8  0.2   1:20.89 stream
>
> Note the processes that are not running at 100% CPU.
>
> hwloc-bind  --get --pid 45523
> 0x0011,0x1133
> 

Hello Brock,

I don't see anything helpful to answer here :/

Do you know which core is overloaded and which (two?) cores are idle?
Does that change during one run, or from one run to another?
Pressing 1 in top should give that information in the very first lines.
Then you can try binding another process to one of the idle cores,
to see if the kernel accepts that.
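
For instance (a quick sketch; replace core:5 with the logical index of one
of the idle cores you spotted in top):

  hwloc-bind core:5 -- ./stream &
  hwloc-bind --get-last-cpu-location --pid $!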

You can also press "f" and "j" (or "f" and use arrows and space to
select "last used cpu") to add a "P" column which tells you the last CPU
used by each process.
hwloc-bind --get-last-cpu-location --pid <pid> should give the same info
but it seems broken on my machine right now, going to debug.

One thing to check would be to run more than 12 processes and see where
the kernel puts them. If it keeps ignoring two cores, that would be funny :)
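
A rough sketch of that test (assuming a trivial spin loop is enough as a
CPU burner inside the job's cpuset):

  # start 14 busy processes, give them a moment, then see where they landed
  for i in `seq 14`; do sh -c 'while :; do :; done' & done
  sleep 5
  for pid in `jobs -p`; do
      echo -n "$pid  "
      hwloc-bind --get-last-cpu-location --pid $pid
  done | sort -k 2
  kill `jobs -p`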

Brice