On Thu, Jul 05, 2007 at 03:22:12PM -0700, Roland Dreier wrote:
> Loading and unloading ib_mthca many times works fine on a non-Xen
> system. So there is something different about the Xen environment
> that is causing a problem. It could be a bug in mthca exposed by Xen
> (eg improper use of the DMA mapping API or something like that).
>
> Can you turn on all the memory debugging options like SLAB_DEBUG
> etc. and see if it turns up anything?

Well, I turned on slab debug, vm debug and mthca debug. The output is below.
Anything interesting in it?

# insmod ib_mthca.ko debug_level=1
ib_mthca: Mellanox InfiniBand HCA driver v0.08 (February 14, 2006)
ib_mthca: Initializing 0000:08:00.0
PCI: Enabling device 0000:08:00.0 (0000 -> 0002)
Slab corruption: start=ffff880098f513b8, len=256
Redzone: 0x1600000016/0x1700000017.
Last user: <0000001800000018>(0x1800000018)
000: 17 00 00 00 17 00 00 00 18 00 00 00 18 00 00 00
010: 19 00 00 00 19 00 00 00 1a 00 00 00 1a 00 00 00
020: 1b 00 00 00 1b 00 00 00 1c 00 00 00 1c 00 00 00
030: 1d 00 00 00 1d 00 00 00 1e 00 00 00 1e 00 00 00
040: 1f 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00
050: 01 00 00 00 01 00 00 00 02 00 00 00 02 00 00 00
Prev obj: start=0000000398f5120b, len=256
Unable to handle kernel paging request at 0000000398f5130b RIP:
 <ffffffff80277313> print_objinfo+0x22/0xde
PGD 9b0a1067 PUD 0
Oops: 0000 1 SMP
CPU 0
Modules linked in: ib_mthca nfs lockd nfs_acl sunrpc ib_ipoib ib_cm ib_sa ib_mad ib_core memtrack ipv6 e1000 dm_mod parport_pc lp parport xfs ata_piix ahci piix mptsas mptscsih mptbase scsi_transport_sas raid0 sata_nv libata amd74xx sd_mod scsi_mod ide_disk ide_core
Pid: 2193, comm: insmod Not tainted 2.6.18-xen31-smp #6
RIP: e030:<ffffffff80277313>  <ffffffff80277313> print_objinfo+0x22/0xde
RSP: e02b:ffff88009acfd8c8  EFLAGS: 00010206
RAX: 0000000398f5130b RBX: 00000000008bd8c1 RCX: ffffffffff57c000
RDX: 0000000000000002 RSI: 0000000398f51203 RDI: ffff8800015f20c0
RBP: ffff8800015f20c0 R08: ffff88009ae9e3c8 R09: 00000000000035eb
R10: ffff88009acfd818 R11: ffffffff802fd0b5 R12: 0000000398f51203
R13: 0000000000000002 R14: ffff880098f513b0 R15: ffff880098f51000
FS:  00002aaaaadedb00(0000) GS:ffffffff804aa000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process insmod (pid: 2193, threadinfo ffff88009acfc000, task ffff88009c3a1080)
Stack:  00000000008bd8c1 ffff8800015f20c0 0000000398f51203 0000000000000100
 ffff880098f513b0 ffffffff80277521 ffff8800015f20c0 0000000000000000
 ffff8800015f20c0 ffff880098f513b0 ffffffff88318ece 00000000000000d0
Call Trace:
 <ffffffff80277521> check_poison_obj+0x152/0x1ae
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff80278269> cache_alloc_debugcheck_after+0x34/0x1b0
 <ffffffff802784d7> kmem_cache_alloc+0xf2/0x102
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff88319263> :ib_mthca:mthca_alloc_icm_table+0x138/0x227
 <ffffffff88307bab> :ib_mthca:mthca_init_hca+0x5ee/0xde7
 <ffffffff802bb44d> sysfs_add_file+0x77/0x86
 <ffffffff803228d9> device_create_file+0x31/0x39
 <ffffffff883088d3> :ib_mthca:__mthca_init_one+0x52f/0xb50
 <ffffffff80277073> poison_obj+0x24/0x2d
 <ffffffff88308f6a> :ib_mthca:mthca_init_one+0x76/0x8b
 <ffffffff802f54df> pci_device_probe+0x4a/0x70
 <ffffffff80324481> driver_probe_device+0x52/0xa8
 <ffffffff803245ac> __driver_attach+0x6b/0xa9
 <ffffffff80324541> __driver_attach+0x0/0xa9
 <ffffffff803239c2> bus_for_each_dev+0x43/0x6e
 <ffffffff80323d04> bus_add_driver+0x73/0x10f
 <ffffffff802f50f7> __pci_register_driver+0x57/0x7e
 <ffffffff88186193> :ib_mthca:mthca_init+0x135/0x148
 <ffffffff802478ce> sys_init_module+0x16e1/0x180a
 <ffffffff802099da> system_call+0x86/0x8b
 <ffffffff80209954> system_call+0x0/0x8b

Code: 48 8b 18 48 89 ef e8 11 fd ff ff 48 8b 30 48 c7 c7 da c3 3e

-- 
Lukáš Hejtmánek
