Hi Kenneth and Fotis,

Thank you for your kind and prompt replies. I think I will go in both directions: installing goolf and, at the same time, installing a new version of goalf that supports OFED. I didn't run into this problem with EasyBuild v1.1.0 because of what you mentioned, Kenneth.
I am looking forward to seeing both of you in Cyprus in a few weeks.

Best Regards,
Mohammed Gaafar
HPC System Administrator
Supercomputer Project
International School of Information Science
Bibliotheca Alexandrina
Tel: +20 3 4839999 Ext.: 1453
Cell: +201061822670 / +201117223299

________________________________________
From: [email protected] [[email protected]] on behalf of Kenneth Hoste [[email protected]]
Sent: Monday, September 23, 2013 9:49 AM
To: [email protected]
Subject: Re: [easybuild] Problem with OpenMPI/1.4.5-GCC-4.6.3-no-OFED

Hi Mohammed,

On 22 Sep 2013, at 12:57, Mohammed Gaafar wrote:

> Dear EasyBuilders,
>
> I am facing a problem with the OpenMPI/1.4.5-GCC-4.6.3-no-OFED module. It
> doesn't work with some software packages (NWChem, for example) and gives an
> error message similar to this one:
>
> [comp023.local][[32496,1],72][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.30.24 failed: Connection refused (111)
>
> I have installed the same OpenMPI version manually and it worked fine. Also,
> I have installed another version of OpenMPI using EasyBuild and it worked
> fine. This module matters because it is the one included in the goalf-1.1
> module, which has become very popular on our system at BA and which we
> always use to build our software.
>
> What I understand from this error is that some network interfaces are not
> reachable by MPI. On the other hand, those interfaces work fine with the
> other versions of MPI and don't give any error. I don't know if this is
> relevant or not, but it is reporting this error on the InfiniBand
> network (192.168.30.0 subnet).
>
> Any ideas regarding troubleshooting or solutions to this problem?

This is not surprising: the goalf toolchain and the OpenMPI build it uses have the -no-OFED version suffix, indicating that OpenMPI was built without InfiniBand support (i.e., --without-openib).
In the very early easyconfig files we shipped, --without-openib was not used explicitly. This was found to be a bug, because it allowed OpenMPI to enable IB support by itself, which doesn't match the -no-OFED version suffix.

As Fotis already suggested, the goolf toolchain (which uses OpenBLAS instead of ATLAS) is likely to be a better choice if you need/want to stick with an open source toolchain.

If you want to stick with goalf, you should compose a version that does have IB support (let us know if you need help there).

I hope this helps, and see you in Cyprus in a couple of weeks. ;-)

regards,

Kenneth
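
To make the point about the -no-OFED suffix concrete, here is a minimal sketch of the consistency rule described above: a build labelled -no-OFED should pass --without-openib explicitly, otherwise OpenMPI's configure may enable IB support on its own. The helper ib_settings_consistent is hypothetical and not part of EasyBuild, and the configopts strings are illustrative assumptions, not copied from the shipped easyconfig files.

```python
# Hypothetical helper, not part of EasyBuild: checks that an OpenMPI
# easyconfig's version suffix agrees with its configure options.

def ib_settings_consistent(versionsuffix, configopts):
    """Return True if the -no-OFED suffix and --without-openib flag agree."""
    opts = configopts.split()
    no_ofed = versionsuffix.endswith('-no-OFED')
    explicitly_disabled = '--without-openib' in opts
    if no_ofed:
        # Without the explicit flag, configure may enable IB support by itself.
        return explicitly_disabled
    # A build without the suffix should not disable IB support.
    return not explicitly_disabled


# Illustrative values only; the configopts strings are assumptions.
early = {'name': 'OpenMPI', 'version': '1.4.5', 'versionsuffix': '-no-OFED',
         'configopts': '--enable-shared'}
fixed = dict(early, configopts='--enable-shared --without-openib')

for ec in (early, fixed):
    ok = ib_settings_consistent(ec['versionsuffix'], ec['configopts'])
    print(ec['configopts'], '->', 'consistent' if ok else 'inconsistent')
```

The same check, applied the other way round, would flag an IB-enabled goalf variant that still carries the -no-OFED suffix.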

