Roland Dreier <[email protected]> wrote:

> If it's possible, it would be useful to try with the vanilla kernel
> and all upstream modules too.  Otherwise I can't rule out the
> possibility that we're chasing a bug that OFED introduces.

OK, I've tested the modules from the kernel.

What I've done:
I've built linux-2.6.39.4 (vanilla) with the InfiniBand drivers that ship with it.
I've installed OFED-1.5.4-rc4 as usual and deleted the new modules in 
'/lib/modules/<version>/update/drivers' (because we want to use the 
linux-2.6.39.4 modules).
I've run 'depmod -a'.
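
To double-check that the in-tree modules are really the ones picked up after
'depmod -a', something like the following could be used. This is only a sketch:
the function names are made up for illustration, and it assumes that
'modinfo -n' is available and that in-tree modules are installed under
/lib/modules/<version>/kernel/.

```python
import subprocess


def module_origin(module_path, kernel_version):
    """Classify a .ko path as 'in-tree' or 'override'.

    Assumption: in-tree modules live under /lib/modules/<version>/kernel/;
    anything else (for example an OFED update directory) counts as an override.
    """
    in_tree_prefix = "/lib/modules/%s/kernel/" % kernel_version
    return "in-tree" if module_path.startswith(in_tree_prefix) else "override"


def resolved_module_path(name):
    """Ask modinfo which file would be loaded for `name` (needs modinfo in PATH)."""
    result = subprocess.run(
        ["modinfo", "-n", name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def check_modules(names, kernel_version):
    """Return {module name: (path, origin)} for the given module names."""
    report = {}
    for name in names:
        path = resolved_module_path(name)
        report[name] = (path, module_origin(path, kernel_version))
    return report
```

Calling check_modules(["mlx4_core", "mlx4_ib", "ib_uverbs"], "2.6.39.4") on the
test machine should then report "in-tree" for every entry if the OFED copies
are really gone.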

When I ran '/etc/init.d/openib start', I got this kernel log:
[ 1438.479977] mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
[ 1438.480106] mlx4_core: Initializing 0000:12:00.0
[ 1438.480229] mlx4_core 0000:12:00.0: Warning: couldn't set 64-bit PCI DMA mask.
[ 1438.480348] mlx4_core 0000:12:00.0: Warning: couldn't set 64-bit consistent PCI DMA mask.
[ 1440.546533] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.546704] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.546865] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.547026] Kernel unaligned access at TPC[102df650] mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]
[ 1440.771361] mlx4_core 0000:12:00.0: Sense command failed for port: 1
[ 1440.771926] mlx4_core 0000:12:00.0: Sense command failed for port: 2
[ 1440.884909] mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
[ 1441.469341] ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 1442.860385] ib0: enabling connected mode will cause multicast packet drops
[ 1442.886875] ib0: mtu > 2044 will cause multicast packet drops.
[ 1443.932557] ib1: enabling connected mode will cause multicast packet drops
[ 1443.938749] ib1: mtu > 2044 will cause multicast packet drops.
[ 1444.481266] ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
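
For comparing runs, the unaligned-access warnings in a log like the one above
can be tallied per faulting location. The following is a rough sketch, not a
polished tool; the function name and the regular expression are my own, and the
pattern is only derived from the message format seen in this log.

```python
import re
from collections import Counter

# Matches the "Kernel unaligned access" warning as it appears in this log.
# \s+ also matches a newline, so lines wrapped by a mail client still count.
UNALIGNED_RE = re.compile(
    r"Kernel unaligned access at TPC\[([0-9a-f]+)\]\s+"
    r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\s+\[(\w+)\]"
)


def tally_unaligned(log_text):
    """Count warnings per (TPC, symbol, module) tuple."""
    return Counter(UNALIGNED_RE.findall(log_text))


# Two of the lines from the log, one intact and one wrapped.
SAMPLE = (
    "[ 1440.546533] Kernel unaligned access at TPC[102df650] "
    "mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]\n"
    "[ 1440.546704] Kernel unaligned access at TPC[102df650]\n"
    "mlx4_QUERY_ADAPTER+0xf0/0x11c [mlx4_core]\n"
)

if __name__ == "__main__":
    for (tpc, symbol, module), count in tally_unaligned(SAMPLE).items():
        print("%s %s [%s]: %d" % (tpc, symbol, module), count)
```

Feeding it the complete dmesg output would show whether all warnings come from
the same instruction in mlx4_QUERY_ADAPTER, as they appear to here.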


Then I ran 'ibv_devinfo' and got errors (which I did not get with the kernel
modules from OFED):
# ibv_devinfo 
mlx4: There is a mismatch between the kernel and the userspace libraries: Kernel does not support XRC. Exiting.
Failed to open device

The error in the kernel log was still present, too:
[ 1464.503632] swap_free: Bad swap file entry 100005e000061800
[ 1464.503759] BUG: Bad page map in process ibv_devinfo  pte:bc0000c300104848 pmd:00fe46f0
[ 1464.503877] addr:fffff80100114000 vm_flags:000844fa anon_vma:          (null) mapping:fffff807f401b6a0 index:6180082
[ 1464.504054] vma->vm_file->f_op->mmap: ib_uverbs_mmap+0x8/0x38 [ib_uverbs]
[ 1464.504102] Call Trace:
[ 1464.504141]  [00000000004cd430] unmap_vmas+0x514/0x7f4
[ 1464.504191]  [00000000004d114c] unmap_region+0xb4/0x164
[ 1464.504269]  [00000000004d2198] do_munmap+0x2a8/0x31c
[ 1464.504323]  [000000000042d340] SyS_64_munmap+0x88/0xa8
[ 1464.504378]  [0000000000406154] linux_sparc_syscall+0x34/0x44
[ 1464.504423] Disabling lock debugging due to kernel taint
[ 1464.504478] swap_free: Bad swap file entry 100005e000061802
[ 1464.504526] BUG: Bad page map in process ibv_devinfo  pte:bc0000c300504848 pmd:00fe46f0
[ 1464.504585] addr:fffff8010011a000 vm_flags:000844fa anon_vma:          (null) mapping:fffff807f401b6a0 index:6180282
[ 1464.504665] vma->vm_file->f_op->mmap: ib_uverbs_mmap+0x8/0x38 [ib_uverbs]
[ 1464.504711] Call Trace:
[ 1464.504742]  [00000000004cd430] unmap_vmas+0x514/0x7f4
[ 1464.504790]  [00000000004d114c] unmap_region+0xb4/0x164
[ 1464.504838]  [00000000004d2198] do_munmap+0x2a8/0x31c
[ 1464.504888]  [000000000042d340] SyS_64_munmap+0x88/0xa8
[ 1464.504967]  [0000000000406154] linux_sparc_syscall+0x34/0x44
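
The vm_flags value in both reports (000844fa) can be decoded into flag names.
The sketch below is only a reading aid: the bit values are the VM_* definitions
as I remember them from include/linux/mm.h in kernels of this era, which is an
assumption that should be checked against the tree actually in use, and the
function name is made up.

```python
# Assumed VM_* bit values (include/linux/mm.h, 2.6.39 era); verify against
# your own tree. Config-dependent bits such as 0x200 are left out on purpose.
VM_FLAGS = {
    0x00000001: "VM_READ",
    0x00000002: "VM_WRITE",
    0x00000004: "VM_EXEC",
    0x00000008: "VM_SHARED",
    0x00000010: "VM_MAYREAD",
    0x00000020: "VM_MAYWRITE",
    0x00000040: "VM_MAYEXEC",
    0x00000080: "VM_MAYSHARE",
    0x00000100: "VM_GROWSDOWN",
    0x00000400: "VM_PFNMAP",
    0x00000800: "VM_DENYWRITE",
    0x00001000: "VM_EXECUTABLE",
    0x00002000: "VM_LOCKED",
    0x00004000: "VM_IO",
    0x00008000: "VM_SEQ_READ",
    0x00010000: "VM_RAND_READ",
    0x00020000: "VM_DONTCOPY",
    0x00040000: "VM_DONTEXPAND",
    0x00080000: "VM_RESERVED",
}


def decode_vm_flags(value):
    """Return (list of flag names set in value, leftover bits not in the table)."""
    names = [name for bit, name in sorted(VM_FLAGS.items()) if value & bit]
    known = sum(bit for bit in VM_FLAGS if value & bit)
    return names, value & ~known


if __name__ == "__main__":
    names, leftover = decode_vm_flags(0x000844FA)
    print(" | ".join(names), "leftover=%#x" % leftover)
```

If the table is right, the value decodes to a shared, writable mapping with
VM_PFNMAP, VM_IO and VM_RESERVED set, which looks like a device mapping set up
through ib_uverbs_mmap rather than ordinary anonymous memory.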

BTW: I'm still trying to get the new firmware for my adapters. Maybe it will
work out tomorrow...

Regards,
Lukas