Michael, will you be able to help me with some of the issues listed below?
Pradeep
[EMAIL PROTECTED]

----- Forwarded by Pradeep Satyanarayana/Beaverton/IBM on 04/13/2007 08:33 AM -----

Pradeep Satyanarayana/Beaverton/IBM
04/12/2007 01:58 PM

To       [EMAIL PROTECTED]
cc       "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Subject  mthca issues - need help

I am running into a number of mthca issues, listed below, and need help with them.

1. I am using linux-2.6.21-rc5 and I see this Oops when I modprobe ib_mthca (on ppc64):

Apr 12 14:11:19 elm3b37 kernel: ib_mthca 0002:d9:00.0: HCA FW version 3.3.3 is old (3.4.0 is current).
Apr 12 14:11:19 elm3b37 kernel: ib_mthca 0002:d9:00.0: If you have problems, try updating your HCA FW.
Apr 12 14:11:19 elm3b37 kernel: Faulting instruction address: 0xd0000000002db0d8
Apr 12 14:11:19 elm3b37 kernel: Oops: Kernel access of bad area, sig: 11 [#2]
Apr 12 14:11:19 elm3b37 kernel: SMP NR_CPUS=128 NUMA
Apr 12 14:11:19 elm3b37 kernel: Modules linked in: ib_mthca ib_mad ib_ehca ib_core autofs4 ipv6 binfmt_misc parport_pc lp parport sg e1000 dm_snapshot dm_zero dm_mirror dm_mod ipr libata sd_mod scsi_mod firmware_class ehci_hcd ohci_hcd usbcore
Apr 12 14:11:19 elm3b37 kernel: NIP: D0000000002DB0D8 LR: D0000000002DAE0C CTR: 0000000000000400
Apr 12 14:11:19 elm3b37 kernel: REGS: c0000000e2116f60 TRAP: 0300 Not tainted (2.6.21-rc5)
Apr 12 14:11:19 elm3b37 kernel: MSR: 8000000000009032 <EE,ME,IR,DR> CR: 24024444 XER: 00000008
Apr 12 14:11:19 elm3b37 kernel: DAR: 0000000000002000, DSISR: 0000000042000000
Apr 12 14:11:19 elm3b37 kernel: TASK = c0000000e7de4040[3884] 'modprobe' THREAD: c0000000e2114000 CPU: 0
Apr 12 14:11:19 elm3b37 kernel: GPR00: 0000000040010001 C0000000E21171E0 D000000000308B30 0000000007FFFFFF
Apr 12 14:11:19 elm3b37 kernel: GPR04: C0000000E595FE00 0000000000000000 C0000000E2438000 0000000000000400
Apr 12 14:11:19 elm3b37 kernel: GPR08: 0000000000000000 0000000000000400 0000000000002000 0000000000000000
Apr 12 14:11:19 elm3b37 kernel: GPR12: D0000000002EAD28 C000000000535A80 AAAAAAAAAAAAAAAB D0000000005A0C10
Apr 12 14:11:19 elm3b37 kernel: GPR16: 0000000000000000 0000000000000312 0000000000000312 000000000000003F
Apr 12 14:11:19 elm3b37 kernel: GPR20: C0000000E595FE20 C0000000E4F04000 C0000000E595FE00 0000000000000000
Apr 12 14:11:19 elm3b37 kernel: GPR24: C0000000E4FAF000 0000000007FFFFFF 0000000000000000 0000000000002000
Apr 12 14:11:19 elm3b37 kernel: GPR28: C0000000E2438000 0000000000000400 D0000000003075B0 0000000000000400
Apr 12 14:11:19 elm3b37 kernel: NIP [D0000000002DB0D8] .mthca_write_mtt+0x328/0x460 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: LR [D0000000002DAE0C] .mthca_write_mtt+0x5c/0x460 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: Call Trace:
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21171E0] [C0000000E2117300] 0xc0000000e2117300 (unreliable)
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21172D0] [D0000000002DBD1C] .mthca_mr_alloc_phys+0x8c/0x140 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117390] [D0000000002D6B6C] .mthca_create_eq+0x3ac/0x5e0 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117490] [D0000000002D7528] .mthca_init_eq_table+0x198/0x790 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117560] [D0000000002D0368] .__mthca_init_one+0xa38/0xd70 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117640] [D0000000002D0714] .mthca_init_one+0x74/0xf0 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21176E0] [C0000000002487D8] .pci_device_probe+0x168/0x200
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21177A0] [C0000000002C288C] .really_probe+0xbc/0x1f0
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117850] [C0000000002C2D3C] .__driver_attach+0xfc/0x140
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21178E0] [C0000000002C1668] .bus_for_each_dev+0x88/0xe0
Apr 12 14:11:19 elm3b37 kernel: [C0000000E21179A0] [C0000000002C2628] .driver_attach+0x28/0x40
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117A20] [C0000000002C1C34] .bus_add_driver+0xc4/0x220
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117AC0] [C0000000002C3118] .driver_register+0x78/0xe0
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117B40] [C000000000248B70] .__pci_register_driver+0x90/0x120
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117BE0] [D0000000002EA050] .mthca_init+0x100/0x170 [ib_mthca]
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117C70] [C0000000000848FC] .sys_init_module+0x20c/0x1990
Apr 12 14:11:19 elm3b37 kernel: [C0000000E2117E30] [C00000000000862C] syscall_exit+0x0/0x40
Apr 12 14:11:19 elm3b37 kernel: Instruction dump:
Apr 12 14:11:19 elm3b37 kernel: 7d290214 7d495a14 409d0038 393fffff 39600000 79290020 39290001 7d2903a6
Apr 12 14:11:19 elm3b37 kernel: 60000000 60000000 7c1c582a 60000001 <7c0a592a> 396b0008 4200fff0 7bfb1f24

2. The above may or may not be a bug. As the message suggests, I wanted to upgrade the firmware. However, I found that the latest firmware is 3.5.0, not 3.4.0 as the message seems to indicate. I want to use IPoIB CM, so which version should I upgrade to? Presumably 3.5.0?

3. From the following URL

http://www.mellanox.com/support/firmware_table_IH.php

it is not clear to me which firmware I should download. lspci -v shows me:

0002:d9:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)
        Subsystem: Mellanox Technologies MT23108 InfiniHost

So I was planning on using fw-23108-3_5_000-MHET2X-1SC_A1.bin.zip. Is that correct?

4. When I downloaded mft-1.0.1.tar I found that ppc64 is not supported.

5. I moved my HCA to an x86_64 machine and then tried to install the mft utilities. A previous version of the tool was already installed, and I asked the installer to uninstall it. After that I see the following:

/home/tools/mft-1.0.1 # ./install.sh

*** Mellanox Firmware Tools (MFT) Package Installation ***
MFT Build 20060118-1817
Copyright (C) June 2002, Mellanox Technologies Ltd. ALL RIGHTS RESERVED.
Use of software subject to the terms and conditions detailed in the file "LICENSE.txt".

Found a previous installation of the MFT package.
Current installed MFT Build ID is 20060118-1817
This installation MFT Build ID is 20060118-1817
Remove currently installed components (run /usr/mellanox/mft/uninstall.sh) ? :(y/n) [n] y
Running /usr/mellanox/mft/uninstall.sh ...
Uninstall completed successfully.
This installation installs the MFT components into /usr
Installing MST package under /usr/mst ...
MFT Depends on pre-installed MST.
Fail to find /usr/mst/lib/libmtcr.a

I could not find libmtcr.a anywhere.

I need help with the issues listed above.

Thanks!
Pradeep
[EMAIL PROTECTED]

_______________________________________________
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
