Thank you for your reply, Reese!
What version of GM are you running?
# rpm -qa |egrep "^gm-[0-9]+|^gm-devel"
gm-2.0.24-1
gm-devel-2.0.24-1

Is this too old?
And are you sure that gm_board_info shows all the nodes that are listed in your machine file?
Yes, that was the issue: a bad cable connection to my compute node prevented it from being seen on the fabric :( Thanks for pointing this out to me.
Could you send a copy of your gm_board_info output, please?
Sure:

# ./gm_board_info
GM build ID is "2.0.24_Linux_rc20051223164441PST @dr11.myco.com:/usr/src/redhat/BUILD/gm-2.0.24_Linux Tue Jan 30 23:07:45 EST 2007."

Board number 0:
  lanai_cpu_version = 0x0a00 (LANai10.0)
  lanai_sram_size = 0x001fe000 (2040K bytes)
  ROM settings:
    MAC=00:60:dd:49:1e:bf
    SN=187449
    PC=M3F-PCIXD-2
    PN=09-02666
  LANai time is 0x209b211b12 ticks, or about 1043 minutes since reset.
  Mapper is 00:60:dd:49:99:96.
  Map version is 1965903.
  2 hosts.
  Network is fully configured.
  This node is "dr11.myco.com"
  Board has room for 16 ports, 1559 nodes/routes, 16384 cache entries
  Port token cnt: send=61, recv=253
Port: Status PID
   0: BUSY   7489 (this process [gm_board_info])
   1: BUSY  25113
Route table for this node follows:
gmID MAC Address       gmName                           Route
---- ----------------- -------------------------------- ---------------------
   1 00:60:dd:49:1e:bf dr11.myco.com                    (this node)
   2 00:60:dd:49:99:96 dr05.myco.com                    81 (mapper)
A mismatch between the list of nodes actually configured onto the Myrinet fabric and the machine file is a common source of errors like this. The mismatch could be caused by cable failure or other mapping issues.
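For anyone hitting the same error, the check described above can be scripted. The following is only a rough sketch, not a Myricom or Open MPI tool: it assumes the route table rows look like the ones in the gm_board_info output in this thread (gmID, MAC address, gmName), that the machine file has one host per line with optional trailing fields and `#` comments, and that machine-file hostnames are spelled exactly like the gmName column. The function names and the host dr07.myco.com are made up for illustration.

```python
import re

# A route-table row as seen in this thread: gmID, MAC address, gmName, route.
# Assumption: other GM versions may lay the table out differently.
ROUTE_ROW = re.compile(
    r"^\s*(\d+)\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+(\S+)", re.IGNORECASE
)


def hosts_on_fabric(board_info_text):
    """Collect the gmName column from gm_board_info route-table rows."""
    hosts = set()
    for line in board_info_text.splitlines():
        match = ROUTE_ROW.match(line)
        if match:
            hosts.add(match.group(3))
    return hosts


def hosts_in_machine_file(machine_file_text):
    """Collect the first field of each non-empty, non-comment line."""
    hosts = set()
    for line in machine_file_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.add(line.split()[0])
    return hosts


def missing_from_fabric(board_info_text, machine_file_text):
    """Hosts listed in the machine file but absent from the route table."""
    listed = hosts_in_machine_file(machine_file_text)
    return sorted(listed - hosts_on_fabric(board_info_text))


if __name__ == "__main__":
    # Route table copied from the output in this thread.
    board_info = """\
    2 hosts.
    Route table for this node follows:
    gmID MAC Address       gmName                           Route
    ---- ----------------- -------------------------------- ---------------------
       1 00:60:dd:49:1e:bf dr11.myco.com                    (this node)
       2 00:60:dd:49:99:96 dr05.myco.com                    81 (mapper)
    """
    # Hypothetical machine file; dr07.myco.com is invented for the example.
    machine_file = "dr11.myco.com\ndr05.myco.com slots=2\ndr07.myco.com\n"
    for host in missing_from_fabric(board_info, machine_file):
        print("not on the fabric:", host)
```

In practice you would feed it the saved output of ./gm_board_info and the contents of your own machine file. If the machine file uses short hostnames while GM reports fully qualified ones, the names would need normalizing first.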
Could you elaborate on the mapping issues you mentioned? What are they?
Why GM instead of MX, by the way?
We have a few MX cards in-house, but no MX switch due to its current market price. So we're only able to perform MX testing using direct-connection cables, which is not very exciting :) On the other hand, we already had GM boards and a switch, and found them sufficient for OpenMPI testing purposes. It would be great to upgrade to MX in the near future.

Thank you very much for your help.

Sincerely,
Alex.