Thanks for the info Roland. I just subscribed to the openfabrics list,
so I'll start emailing there from now on.
I created the /etc/modprobe.d/mlx4_core file you suggested, and the
mlx4_ib module is now loading, so that's a good step. However, OpenMPI
is still not finding the HCAs. I'm getting
another error (on each node) that I didn't post in the Ubuntu forum:
libibverbs: Fatal: couldn't read uverbs ABI version.
I uninstalled and reinstalled the libibverbs1 package (apt-get purge,
apt-get install), but the error persists. Below is the output of
lsmod (just to confirm the loaded modules).
Module Size Used by
mlx4_ib 51712 0
mlx4_core 83808 1 mlx4_ib
af_packet 34440 2
nfsd 289192 13
auth_rpcgss 60448 1 nfsd
exportfs 14208 1 nfsd
ipv6 325768 22
nfs 298744 0
lockd 83248 3 nfsd,nfs
nfs_acl 12416 2 nfsd,nfs
sunrpc 221064 11 nfsd,auth_rpcgss,nfs,lockd,nfs_acl
iptable_filter 11776 0
ip_tables 31720 1 iptable_filter
x_tables 30728 1 ip_tables
ac 15496 0
lp 22084 0
loop 28676 0
ib_mthca 147716 0
ib_mad 50340 2 mlx4_ib,ib_mthca
ib_core 70400 3 mlx4_ib,ib_mthca,ib_mad
container 13824 0
amd_rng 11656 0
i2c_amd756 16004 0
serio_raw 16260 0
i2c_core 35712 1 i2c_amd756
button 18080 0
k8temp 14848 0
parport_pc 48296 1
parport 51340 2 lp,parport_pc
shpchp 45340 0
pci_hotplug 41776 1 shpchp
pcspkr 12160 0
evdev 22144 4
psmouse 53404 0
ext3 156176 6
jbd 64168 1 ext3
mbcache 18560 1 ext3
sr_mod 27300 0
cdrom 48680 1 sr_mod
pata_acpi 17024 0
sg 48920 0
sd_mod 40448 20
floppy 76264 0
sata_sil 21640 13
tg3 131972 0
pata_amd 23940 0
ohci_hcd 34692 0
sata_promise 24580 2
ata_generic 17156 0
usbcore 176816 2 ohci_hcd
libata 183472 5 pata_acpi,sata_sil,pata_amd,sata_promise,ata_generic
scsi_mod 185528 4 sr_mod,sg,sd_mod,libata
dm_mirror 33408 0
dm_snapshot 27848 0
dm_mod 78200 5 dm_mirror,dm_snapshot
thermal 26912 0
processor 49608 1 thermal
fan 13960 0
fbcon 53504 0
tileblit 11264 1 fbcon
font 17280 1 fbcon
bitblit 14592 1 fbcon
softcursor 10880 1 bitblit
fuse 63280 1
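
For anyone following this thread: the listing above has no ib_uverbs entry, and as far as I know libibverbs reads its ABI version from a sysfs file that the ib_uverbs module provides. A minimal sketch of a check follows, assuming the usual sysfs path /sys/class/infiniband_verbs/abi_version; the function names and the MODULES_FILE and ABI_FILE variables are made up for illustration and are not part of any package.

```shell
#!/bin/sh
# Sketch only: report whether the userspace-verbs pieces libibverbs needs are present.
# Assumption: the uverbs ABI version is exposed at the sysfs path below.
ABI_FILE="${ABI_FILE:-/sys/class/infiniband_verbs/abi_version}"
# /proc/modules lists loaded modules, one per line, with the name as the first field
# (the same first column that lsmod prints).
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

# module_loaded NAME: succeed if NAME is exactly the first field of some line.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "$MODULES_FILE"
}

# check_uverbs: print the state of the ib_uverbs module and the ABI version file.
check_uverbs() {
    if module_loaded ib_uverbs; then
        echo "ib_uverbs: loaded"
    else
        echo "ib_uverbs: not loaded (possible fix: sudo modprobe ib_uverbs)"
    fi
    if [ -r "$ABI_FILE" ]; then
        echo "uverbs ABI version: $(cat "$ABI_FILE")"
    else
        echo "uverbs ABI version: unreadable ($ABI_FILE)"
    fi
}

check_uverbs
```

If the module turns out to be missing, loading it by hand with "sudo modprobe ib_uverbs" and rerunning the check would be the obvious next thing to try.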
-------------------------------------------
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
[EMAIL PROTECTED]
-------------------------------------------
On Aug 26, 2008, at 11:11 AM, Roland Dreier wrote:
Apologies for contacting you directly - are you available to provide
some support for the Infiniband packages you built for Ubuntu 8.04? I
posted my issue on the Ubuntu forums a while ago
(http://ubuntuforums.org/showthread.php?t=896924 ), but haven't
received any response yet.
No problem, I don't really read web forums. The best way to get help
with IB in general is to email the list [EMAIL PROTECTED]
In your particular case I would guess the problem is that you don't have
the mlx4_ib module loaded; by default only the mlx4_core module will be
auto-loaded. To test this, you can do "sudo modprobe mlx4_ib" by hand
and try it. (This is all based on the fact that you installed the
libmlx4 packages, so I'm assuming you have ConnectX cards).
A better solution would be to create a file named
/etc/modprobe.d/mlx4_core with the line
install mlx4_core /sbin/modprobe --ignore-install mlx4_core && /sbin/modprobe mlx4_ib
in it, which should make mlx4_ib load by default on boot.
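
The file described above could be created with something like the following sketch. The function name write_mlx4_rule and its optional directory argument are hypothetical, added so the rule can be written somewhere harmless first and inspected; only the install line itself comes from the message.

```shell
#!/bin/sh
# Sketch only: write the install rule into DIR/mlx4_core.
# DIR defaults to /etc/modprobe.d, which needs root to write to.
write_mlx4_rule() {
    dir="${1:-/etc/modprobe.d}"
    # Single line: load mlx4_core normally, then pull in mlx4_ib as well.
    printf '%s\n' \
        'install mlx4_core /sbin/modprobe --ignore-install mlx4_core && /sbin/modprobe mlx4_ib' \
        > "$dir/mlx4_core"
}
```

Calling write_mlx4_rule /tmp writes a copy to /tmp/mlx4_core for review; calling it as root with no argument writes the real file. Afterwards, "sudo modprobe mlx4_core" followed by "lsmod | grep mlx4" should show both modules if the rule is being picked up. On newer distributions the file name may need to end in .conf to be read, so treat the bare name as specific to this setup.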
Also, if you are willing to provide support, I would like your
permission to post some of your suggestions and the ultimate
resolution on the Ubuntu forum so that others can benefit from our
dialogue.
Go ahead and add it to the thread.
- R.