On Tue, Aug 30, 2011 at 12:23 PM, robert <[email protected]> wrote:
>
>> 32 nodes or 32 cores?  I don't know the details of your cluster so it
>> may be obvious, but make sure you aren't accidentally running too many
>> MPI processes on a given node.
>>
> As far as I understood it, it is:
>
> 1 node = 4cores
>
> 4GB/node

This doesn't match the output of the top command you posted below.
The total memory reported there is 31,985,140 kilobytes, i.e. about
30.5 gigabytes.

Does the cluster you are on have a public information web page?  That
would probably help clear things up...



> For testing and learning I only used a partition of 32 nodes.
> I have just changed to 128 nodes but this doesn't change anything.
>
>
> If I am running into swap and I use --enable-parmesh this wouldn't
> change much, (since I have one copy of the mesh per mpi-process), right?

The idea would be to run fewer processes per node.  For example, you
could run one MPI process on each of 128 different nodes; then each
process would have the node's full amount of RAM to itself.  The
method for doing this is again cluster-dependent; I don't know if
it's possible on your particular cluster.
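As a sketch of what that can look like (assuming the cluster uses Open MPI directly; a batch scheduler such as PBS or SLURM would need its own directives instead, and `hosts.txt` is a hypothetical file name):

```shell
# Sketch, assuming Open MPI: launch 128 processes, capped at one per
# node, so each process sees the node's full RAM.
mpirun -np 128 --npernode 1 ./foo

# Equivalently, with an explicit hostfile advertising one slot per node:
#   node001 slots=1
#   node002 slots=1
#   ...
mpirun -np 128 --hostfile hosts.txt ./foo
```

The flag names vary between MPI implementations (MPICH uses a different machinefile syntax), so check your cluster's documentation for the supported spelling.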




> top - 20:19:21 up 35 days,  8:55, 51 users,  load average: 0.01, 0.29, 0.45
> Tasks: 399 total,   1 running, 397 sleeping,   1 stopped,   0 zombie
> Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  31985140k total, 31158420k used,   826720k free,   274980k buffers
> Swap:  8393952k total,      160k used,  8393792k free, 16572876k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+   COMMAND
>  2955 bodner    16   0  3392 1932 1244 R    1  0.0   0:00.69 top
>  6602 bodner    15   0 14296 3248 1864 S    0  0.0   0:10.11 sshd
>  2829 bodner    15   0 19604 3892 3092 S    0  0.0   0:00.17 mpirun
> The last one is the process of interest.

Actually, none of these is the process of interest... we would need
to see the actual processes that mpirun spawned.  That is, if you ran
something like

mpirun -np 4 ./foo

you would need to look for the four instances of "foo" in the top
output and see how much CPU/memory they are consuming.
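One way to check this outside of top (a sketch; "foo" is the placeholder executable name from the example above, so substitute your actual binary) is to list the spawned processes with ps and sum their resident memory:

```shell
# List every process named foo with its PID, resident set size (RSS,
# in kilobytes), and CPU usage, then sum the RSS column to get the
# total physical-memory footprint on this node.
# "foo" is a placeholder -- substitute your actual executable name.
ps -C foo -o pid=,rss=,pcpu= \
  | awk '{ total += $2; print } END { printf "total RSS: %d kB\n", total }'
```

If the summed RSS approaches the node's physical memory (about 32 GB here), the processes are almost certainly pushing the machine into swap.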

-- 
John

_______________________________________________
Libmesh-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-users
