Yevgeny, Jeff,
I've tried 26/2 on a node with 2TB RAM - the IB cards are not reachable with this setup.

26/3 is not yet tested (it's a bit of work for our admins to 'repair' a node in case it is not reachable over the IB interface).

By now we have a couple of nodes with up to 2TB RAM running with the 23/5 setup; this seems to be the sonic barrier.

Best,

Paul



On 11/04/12 13:29, Yevgeny Kliteynik wrote:
Hi Jeff,

On 11/4/2012 1:11 PM, Jeff Squyres wrote:
Yevgeny -

Could Mellanox update the FAQ item about this?

Large-memory nodes are becoming more common.

Sure. But I'd like to hear Paul's input on this first.
Did it work with log_num_mtt=26?
I don't have that kind of machine to test this on.

-- YK



On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:

Hi Paul,

On 10/31/2012 10:22 PM, Paul Kapinos wrote:
Hello Yevgeny, hello all,

Yevgeny, first of all thanks for explaining what the MTT parameters do and why 
there are two of them! I mean this post:
http://www.open-mpi.org/community/lists/devel/2012/08/11417.php

Well, the official recommendation is "twice the RAM amount".

And here we are: we have 2 nodes with 2 TB (that's with a 'tera') RAM and a 
couple of nodes with 1 TB, each with 4x Mellanox IB adapters. Thus we would 
need to raise the MTT parameters in order to make up to 4 TB of memory registrable.

You don't really *have* to be able to register twice the available RAM.
It's just a heuristic. It depends on the application that you're running
and fragmentation that it creates in the MTT.

However:

I've tried raising the MTT parameters in multiple combinations, but the 
maximum amount of registrable memory I was able to get was 1 TB (23/5). All 
attempts to get more (24/5, 23/6 for 2 TB) led to unresponsive InfiniBand HCAs.

Are there any other limits in the kernel that have to be adjusted in order to be 
able to register that much memory?

Unfortunately, the current driver has a limitation in this area, so 1 TB
(23/5 values) is probably the most the driver can do.
IIRC, log_num_mtt can reach 26, so perhaps you can try 26/2 (same 1TB),
and then, if it works, try 26/3 (fingers crossed), which will bring you
to 2 TB, but I'm not sure it will work.
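
For reference, the arithmetic behind the value pairs quoted in this thread
can be sketched as below. This is only an illustration, assuming the usual
relation (registrable memory = 2^log_num_mtt * 2^log_mtts_per_seg * page
size), that the second number in each pair is the mlx4_core parameter
log_mtts_per_seg, and a 4 KiB page size; none of these is stated explicitly
above, and the function name is made up for the example.

```python
# Assumption: 4 KiB pages, as is typical on x86_64 Linux.
PAGE_SIZE = 4096

TIB = 1 << 40  # one tebibyte in bytes


def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Hypothetical helper: upper bound on registrable memory in bytes.

    Assumes: (2 ** log_num_mtt) segments, each holding
    (2 ** log_mtts_per_seg) MTT entries, each entry mapping one page.
    """
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


# The pairs mentioned in this thread, written as log_num_mtt/log_mtts_per_seg.
for log_num_mtt, log_mtts_per_seg in [(23, 5), (26, 2), (24, 5), (23, 6), (26, 3)]:
    size = max_registrable_bytes(log_num_mtt, log_mtts_per_seg)
    print(f"{log_num_mtt}/{log_mtts_per_seg}: {size // TIB} TiB")
```

Under these assumptions 23/5 and 26/2 both come out at 1 TiB, while 24/5,
23/6 and 26/3 come out at 2 TiB, which matches the figures given above.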

This has already been fixed, and the fix was accepted to the upstream
Linux kernel, so it will be included in the next OFED/MLNX_OFED versions.

-- YK


Best,

Paul Kapinos





_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
