Hi Jeff,

On 11/4/2012 1:11 PM, Jeff Squyres wrote:
> Yevgeny -
>
> Could Mellanox update the FAQ item about this?
>
> Large-memory nodes are becoming more common.

Sure. But I'd like to hear Paul's input on this first.
Did it work with log_num_mtt=26?
I don't have machines of that kind to test this on.

-- YK

>
> On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:
>
>> Hi Paul,
>>
>> On 10/31/2012 10:22 PM, Paul Kapinos wrote:
>>> Hello Yevgeny, hello all,
>>>
>>> Yevgeny, first of all thanks for explaining what the MTT parameters do and
>>> why there are two of them! I mean this post:
>>> http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
>>>
>>> Well, the official recommendation is "twice the RAM amount".
>>>
>>> And here we are: we have 2 nodes with 2 TB (that's with a 'tera') RAM and a
>>> couple of nodes with 1 TB, each with 4x Mellanox IB adapters. Thus we should
>>> have raised the MTT parameters in order to make up to 4 TB memory
>>> registrable.
>>
>> You don't really *have* to be able to register twice the available RAM.
>> It's just a heuristic. It depends on the application that you're running
>> and the fragmentation that it creates in the MTT.
>>
>> However:
>>
>>> I've tried to raise the MTT parameters in multiple combinations, but the
>>> maximum amount of registrable memory I was able to get was one TB (23 / 5).
>>> All attempts to get more (24/5, 23/6 for 2 TB) led to unresponsive
>>> InfiniBand HCAs.
>>>
>>> Are there any other limits in the kernel that have to be adjusted in order
>>> to be able to register that much memory?
>>
>> Unfortunately, the current driver has a limitation in this area, so 1 TB
>> (23/5 values) is probably the most the driver can do.
>> IIRC, log_num_mtt can reach 26, so perhaps you can try 26/2 (same 1 TB),
>> and then, if it works, try 26/3 (fingers crossed), which will bring you
>> to 2 TB, but I'm not sure it will work.
>>
>> This has already been fixed, and the fix was accepted into the upstream
>> Linux kernel, so it will be included in the next OFED/MLNX_OFED versions.
>>
>> -- YK
>>
>>
>>> Best,
>>>
>>> Paul Kapinos
>>>
>>> _______________________________________________
>>> devel mailing list
>>> [email protected]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
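For reference, the value pairs discussed in this thread can be checked with a little arithmetic. This is only a minimal sketch under stated assumptions: it uses the formula from the Open MPI FAQ (registrable memory = 2^log_num_mtt * 2^log_mtts_per_seg * page size), assumes a 4 KiB page size, and assumes that the second number in pairs such as "23 / 5" is the mlx4_core parameter log_mtts_per_seg, a name that does not appear in the message itself. The helper name max_registrable_bytes is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on typical x86_64 Linux
TIB = 1 << 40     # one tebibyte in bytes


def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Hypothetical helper: 2^log_num_mtt * 2^log_mtts_per_seg * page_size."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


# The (log_num_mtt, log_mtts_per_seg) pairs mentioned in the thread.
for log_num_mtt, log_mtts_per_seg in [(23, 5), (24, 5), (23, 6), (26, 2), (26, 3)]:
    size = max_registrable_bytes(log_num_mtt, log_mtts_per_seg)
    print(f"{log_num_mtt}/{log_mtts_per_seg}: {size // TIB} TiB")
```

Under these assumptions, 23/5 and 26/2 both come out at 1 TiB, while 24/5, 23/6 and 26/3 come out at 2 TiB, which matches the figures quoted above. Whether the driver accepts a given pair is a separate question that this arithmetic does not answer.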
