Roland Dreier <[email protected]> wrote:
> If it's possible, it would be useful to try with the vanilla kernel
> and all upstream modules too. Otherwise I can't rule out the
> possibility that we're chasing a bug that OFED introduces.

OK, I've tested the modules from the kernel. Here is what I did:

I built linux-2.6.39.4 (vanilla) with its in-tree InfiniBand drivers.
I installed OFED-1.5.4-rc4 as usual and deleted the new modules in
'/lib/modules/<version>/update/drivers' (because we want to use the
linux-2.6.39.4 modules). Then I ran 'depmod -a'.

When I executed '/etc/init.d/openib start', I got this kernel log:

[ 1438.479977] mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
[ 1438.480106] mlx4_core: Initializing 0000:12:00.0
[ 1438.480229] mlx4_core 0000:12:00.0: Warning: couldn't set 64-bit PCI DMA mask.
[ 1438.480348] mlx4_core 0000:12:00.0: Warning: couldn't set 64-bit consistent PCI DMA mask.
[ 1440.546533] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.546704] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.546865] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.547026] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.771361] mlx4_core 0000:12:00.0: Sense command failed for port: 1
[ 1440.771926] mlx4_core 0000:12:00.0: Sense command failed for port: 2
[ 1440.884909] mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
[ 1441.469341] ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 1442.860385] ib0: enabling connected mode will cause multicast packet drops
[ 1442.886875] ib0: mtu > 2044 will cause multicast packet drops.
[ 1443.932557] ib1: enabling connected mode will cause multicast packet drops
[ 1443.938749] ib1: mtu > 2044 will cause multicast packet drops.
[ 1444.481266] ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready

Then I executed 'ibv_devinfo' and got errors (which I didn't get with
the kernel modules from OFED):

# ibv_devinfo
mlx4: There is a mismatch between the kernel and the userspace libraries: Kernel does not support XRC. Exiting.
Failed to open device

And the error in the kernel log was still there, too:

[ 1464.503632] swap_free: Bad swap file entry 100005e000061800
[ 1464.503759] BUG: Bad page map in process ibv_devinfo pte:bc0000c300104848 pmd:00fe46f0
[ 1464.503877] addr:fffff80100114000 vm_flags:000844fa anon_vma: (null) mapping:fffff807f401b6a0 index:6180082
[ 1464.504054] vma->vm_file->f_op->mmap: ib_uverbs_mmap+0x8/0x38 [ib_uverbs]
[ 1464.504102] Call Trace:
[ 1464.504141] [00000000004cd430] unmap_vmas+0x514/0x7f4
[ 1464.504191] [00000000004d114c] unmap_region+0xb4/0x164
[ 1464.504269] [00000000004d2198] do_munmap+0x2a8/0x31c
[ 1464.504323] [000000000042d340] SyS_64_munmap+0x88/0xa8
[ 1464.504378] [0000000000406154] linux_sparc_syscall+0x34/0x44
[ 1464.504423] Disabling lock debugging due to kernel taint
[ 1464.504478] swap_free: Bad swap file entry 100005e000061802
[ 1464.504526] BUG: Bad page map in process ibv_devinfo pte:bc0000c300504848 pmd:00fe46f0
[ 1464.504585] addr:fffff8010011a000 vm_flags:000844fa anon_vma: (null) mapping:fffff807f401b6a0 index:6180282
[ 1464.504665] vma->vm_file->f_op->mmap: ib_uverbs_mmap+0x8/0x38 [ib_uverbs]
[ 1464.504711] Call Trace:
[ 1464.504742] [00000000004cd430] unmap_vmas+0x514/0x7f4
[ 1464.504790] [00000000004d114c] unmap_region+0xb4/0x164
[ 1464.504838] [00000000004d2198] do_munmap+0x2a8/0x31c
[ 1464.504888] [000000000042d340] SyS_64_munmap+0x88/0xa8
[ 1464.504967] [0000000000406154] linux_sparc_syscall+0x34/0x44

BTW: I'm still trying to get the new firmware for my adapters. Maybe
it will work out tomorrow...

Regards,
Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
