Re: using ConnectX card as Ethernet (mlxen)
Hi,

On 2014-1-20, at 21:59, John Baldwin j...@freebsd.org wrote:
> I believe this should work, yes. Getting a crashdump or the panic messages
> would be really helpful in figuring out why it isn't. Thanks.

I rebuilt the kernel, and see no crashes anymore. So that's good.

But there are a bunch of other issues that maybe someone has some ideas about:

(1) Late attach

The ConnectX-3 attaches very late during the boot process, after the system is already in single-user mode. See the attached dmesg; pci17 and pci18 (there are two identical cards in this system) first show as "no driver attached" during the PCI bus enumeration. Only after the system is in single-user mode does mlx4_core attach to the cards.

That means that, e.g., setting sysctls for these cards in /etc/sysctl.conf or configuring their IP addresses via rc.conf is not possible. At the moment, I work around this by sleeping in rc.local and then doing the assignments there, but that's a hack.

Any clues why these cards attach so late?

(2) Device numbers change

After booting, these cards show up in InfiniBand mode:

ib0: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.21
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib1: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Then I force one into Ethernet mode:

# sysctl sys.device.mlx4_core0.mlx4_port1=eth
sys.device.mlx4_core0.mlx4_port1: auto (ib) -> eth

and the device numbers on the ib devices change: ib1 is now ib4, and I have a new mlxen0 device.

ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:21
        inet6 fe80::f652:14ff:fe10:d121%mlxen0 prefixlen 64 scopeid 0xe
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
ib4: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.4a.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

When I change another port into Ethernet mode

# sysctl sys.device.mlx4_core0.mlx4_port2=eth
sys.device.mlx4_core0.mlx4_port2: auto (ib) -> eth

device numbers change again. Now mlxen0 disappears and becomes mlxen1, and I have a new mlxen2 device:

ib2: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ib3: flags=8002<BROADCAST,MULTICAST> metric 0 mtu 65520
        options=80018<VLAN_MTU,VLAN_HWTAGGING,LINKSTATE>
        lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:21
        inet6 fe80::f652:14ff:fe10:d121%mlxen1 prefixlen 64 scopeid 0xe
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
mlxen2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=d05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:d1:22
        inet6 fe80::f652:14ff:fe10:d122%mlxen2 prefixlen 64 scopeid 0xf
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier

Changing the other two ports (on the second card) to Ethernet mode

# sysctl sys.device.mlx4_core1.mlx4_port1=eth
sys.device.mlx4_core1.mlx4_port1: auto (ib) -
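[Editor's note] The rc.local workaround described in the message above uses a fixed sleep. A slightly less fragile variant, sketched here under assumptions, polls until the driver's sysctl node exists and only then applies settings. The helper name wait_for, the timeouts, and the 192.0.2.10/24 address are illustrative and not from the thread; only the sysctl OID and the mlxen0 name are taken from it. This does not explain or fix the late attach itself.

```sh
#!/bin/sh
# wait_for TIMEOUT CMD [ARGS...]
# Re-run CMD about once per second until it succeeds or roughly TIMEOUT
# seconds have passed. Returns 0 on success, 1 on timeout.
# (Hypothetical helper; not part of FreeBSD's rc framework.)
wait_for() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in /etc/rc.local (assumed values, adjust to the system):
#   if wait_for 60 sysctl -n sys.device.mlx4_core0.mlx4_port1; then
#       sysctl sys.device.mlx4_core0.mlx4_port1=eth
#       # The unit number may differ; see issue (2) in the message above.
#       wait_for 10 ifconfig mlxen0 && ifconfig mlxen0 inet 192.0.2.10/24 up
#   fi
```

Polling on `sysctl -n` works because sysctl exits non-zero while the OID does not exist yet, so the loop ends as soon as mlx4_core has attached.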
Re: using ConnectX card as Ethernet (mlxen)
On 2014-1-21, at 10:04, Lars Eggert l...@netapp.com wrote:
> See the attached dmesg

which I of course forgot to attach (sigh). See below.

Lars

GDB: no debug ports present970
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2014 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #8 ab08c30(fas3270)-dirty: Tue Jan 21 09:07:36 CET 2014
    el...@stanley.muccbc.hq.netapp.com:/usr/home/elars/obj/usr/home/elars/src/sys/FAS3270 amd64
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
CPU: Intel(R) Xeon(R) CPU E5240 @ 3.00GHz (3000.17-MHz K8-class CPU)
  Origin=GenuineIntel Id=0x1067a Family=0x6 Model=0x17 Stepping=10
  Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0xc0ce3bdSSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,XSAVE,OSXSAVE
  AMD Features=0x20100800SYSCALL,NX,LM
  AMD Features2=0x1LAHF
  TSC: P-state invariant, performance statistics
real memory = 18253611008 (17408 MB)
avail memory = 16599695360 (15830 MB)
MPTable: NETAPP SB_XVI
Event timer LAPIC quality 400
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 2 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 6
 cpu3 (AP): APIC ID: 7
random device not loaded; using insecure entropy
ioapic0: Assuming intbase of 0
ioapic0 Version 2.0 irqs 0-23 on motherboard
netmap: loaded module
random: Software, Yarrow initialized
smbios0: System Management BIOS at iomem 0xf6c00-0xf6c1e on motherboard
smbios0: Version: 2.5
cryptosoft0: software crypto on motherboard
pcib0: MPTable Host-PCI bridge pcibus 0 on motherboard
pci0: PCI bus on pcib0
pcib1: MPTable PCI-PCI bridge at device 2.0 on pci0
pci1: PCI bus on pcib1
cxgbc0: Carnegie T3 onboard SR KR, 2 ports mem 0xdd001000-0xdd001fff,0xdc80-0xdcff,0xdd00-0xdd000fff irq 16 at device 0.0 on pci1
cxgbc0: AD8158 0xf=0x3 0x1=0xf
cxgbc0: using MSI-X interrupts (9 vectors)
cxgb0: Port 0 10GBASE-R on cxgbc0
cxgb0: Ethernet address: 00:a0:98:30:c2:2a
cxgb1: Port 1 10GBASE-R on cxgbc0
cxgb1: Ethernet address: 00:a0:98:30:c2:2b
cxgbc0: Firmware Version 7.11.0
pcib2: PCI-PCI bridge at device 3.0 on pci0
pci2: PCI bus on pcib2
pcib3: MPTable PCI-PCI bridge at device 4.0 on pci0
pci3: PCI bus on pcib3
pcib4: PCI-PCI bridge mem 0xdd30-0xdd31 irq 16 at device 0.0 on pci3
pci4: PCI bus on pcib4
pcib3: unable to route slot 0 INTB
pcib5: PCI-PCI bridge irq 16 at device 4.0 on pci4
pci5: PCI bus on pcib5
pcib6: MPTable PCI-PCI bridge irq 10 at device 5.0 on pci4
pci6: PCI bus on pcib6
ix0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15 mem 0xdd40-0xdd47,0xdd50-0xdd503fff irq 17 at device 0.0 on pci6
ix0: Using MSIX interrupts with 5 vectors
ix0: Ethernet address: 90:e2:ba:37:d5:b4
ix0: PCI Express Bus: Speed 5.0GT/s Width x8
001.08 [2141] netmap_attach success for ix0
ix1: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15 mem 0xdd48-0xdd4f,0xdd504000-0xdd507fff irq 18 at device 0.1 on pci6
ix1: Using MSIX interrupts with 5 vectors
ix1: Ethernet address: 90:e2:ba:37:d5:b5
ix1: PCI Express Bus: Speed 5.0GT/s Width x8
001.09 [2141] netmap_attach success for ix1
pcib7: PCI-PCI bridge irq 16 at device 8.0 on pci4
pci7: PCI bus on pcib7
pcib8: PCI-PCI bridge at device 0.0 on pci7
pci8: PCI bus on pcib8
pcib9: MPTable PCI-PCI bridge at device 0.0 on pci8
pci9: PCI bus on pcib9
em0: Intel(R) PRO/1000 Network Connection 7.3.8 mem 0xdd62-0xdd63,0xdd60-0xdd61 irq 16 at device 0.0 on pci9
em0: Using an MSI interrupt
em0: Ethernet address: 00:1b:21:a8:a5:34
001.10 [2141] netmap_attach success for em0
em1: Intel(R) PRO/1000 Network Connection 7.3.8 mem 0xdd66-0xdd67,0xdd64-0xdd65 irq 17 at device 0.1 on pci9
em1: Using an MSI interrupt
em1: Ethernet address: 00:1b:21:a8:a5:35
001.11 [2141] netmap_attach success for em1
pcib10: MPTable PCI-PCI bridge at device 1.0 on pci8
pci10: PCI bus on pcib10
em2: Intel(R) PRO/1000 Network Connection 7.3.8 mem 0xdd72-0xdd73,0xdd70-0xdd71 irq 17 at device 0.0 on pci10
em2: Using an MSI interrupt
em2: Ethernet address: 00:1b:21:a8:a5:36
001.12 [2141] netmap_attach success for em2
em3: Intel(R) PRO/1000 Network Connection 7.3.8 mem 0xdd76-0xdd77,0xdd74-0xdd75 irq 18 at device 0.1 on pci10
em3: Using an MSI interrupt
em3: Ethernet address: 00:1b:21:a8:a5:37
001.13 [2141] netmap_attach success for em3
pcib11: PCI-PCI bridge at device 5.0 on pci0
pci11: PCI bus on pcib11
pcib12: PCI-PCI bridge at device 6.0 on pci0
pci12: PCI bus on pcib12
pcib0: unable to
Re: using ConnectX card as Ethernet (mlxen)
Last follow-up: I just saw that there are some additional messages (errors?) on the serial console when changing the device from IB to Ethernet; maybe they mean something to someone:

root@one:~ # sysctl sys.device.mlx4_core0.mlx4_port1=eth
sys.device.mlx4_core0.mlx4_port1: auto (ib)<7>ib0: stopping interface
<7>ib0: downing ib_dev
<7>ib0: stopping multicast thread
<7>ib0: flushing multicast list
<7>qpn 0x48: invalid attribute mask specified for transition 0 to 6. qp_type 4, attr_mask 0x1\n<4>ib0: Failed to modify QP to ERROR state
<7>ib0: All sends and receives done.
<7>ib0: cleaning up ib_dev
<7>ib0: stopping multicast thread
<7>ib0: flushing multicast list
<7>ib0: Cleanup ipoib connected mode.
<7>ib1: stopping interface
<7>ib1: downing ib_dev
<7>ib1: stopping multicast thread
<7>ib1: flushing multicast list
<7>qpn 0x49: invalid attribute mask specified for transition 0 to 6. qp_type 4, attr_mask 0x1\n<4>ib1: Failed to modify QP to ERROR state
<7>ib1: All sends and receives done.
<7>ib1: cleaning up ib_dev
<7>ib1: stopping multicast thread
<7>ib1: flushing multicast list
<7>ib1: Cleanup ipoib connected mode.
<6>mlx4_en mlx4_core0: Using 5 tx rings for port:1
<6>mlx4_en mlx4_core0: Defaulting to 4 rx rings for port:1
<6>mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: f4:52:14:10:d1:21
<4>mlx4_en: mlx4_core0: Port 1: Using 5 TX rings
<4>mlx4_en: mlx4_core0: Port 1: Using 4 RX rings
<6>mlx4_ib: Mellanox ConnectX InfiniBand driver v1.Jan 21 09:21:31 0 (April 4, 2008)
one kernel: mlx4_en: mlx4_core0: Port 1: Using 5 TX rings
Jan <7>ib4: max_srq_sge=31
21 09:21:31 one <7>ib4: max_cm_mtu = 0x1, num_frags=16
kernel: mlx4_en:ib4: mlx4_core0: PorAttached to mlx4_0 port 2
t 1: Using 4 RX rings
 -> eth

Lars
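[Editor's note] Since the unit numbers shift when a port is switched (the log above shows a port reappearing as ib4 while mlxen0 comes up with address f4:52:14:10:d1:21), scripts may be more robust if they look an interface up by its Ethernet address instead of hard-coding a name. The following is a minimal sketch, assuming the usual ifconfig layout of an unindented header line per interface followed by indented attribute lines; the function name iface_by_mac is made up for illustration.

```sh
#!/bin/sh
# iface_by_mac MAC
# Read ifconfig-style output on stdin and print the name of the first
# interface whose "ether" line matches MAC (case-insensitive).
# Prints nothing if no interface matches. (Hypothetical helper.)
iface_by_mac() {
    awk -v mac="$1" '
        # Header lines are not indented, e.g. "mlxen1: flags=...".
        NF > 0 && substr($0, 1, 1) != " " && substr($0, 1, 1) != "\t" {
            name = $1
            sub(/:$/, "", name)
        }
        # Attribute lines are indented, e.g. "<tab>ether f4:52:14:10:d1:21".
        $1 == "ether" && tolower($2) == tolower(mac) {
            print name
            exit
        }
    '
}

# Possible use (MAC taken from the thread):
#   ifname=$(ifconfig | iface_by_mac f4:52:14:10:d1:21)
#   [ -n "$ifname" ] && ifconfig "$ifname" up
```

This only works for ports already in Ethernet mode, because the ib devices report an lladdr line rather than an ether line.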
Re: using ConnectX card as Ethernet (mlxen)
Hi,

On 2013-7-9, at 22:08, John Nielsen li...@jnielsen.net wrote:
> On Jul 9, 2013, at 9:58 AM, John Baldwin j...@freebsd.org wrote:
> > So this was just fixed (finally) in HEAD in r253048. You can now use the
> > sysctls to change this.
>
> I saw the commit. Thanks! I'll give it a try at some point (whenever my
> schedule and hardware availability align).

is this supposed to work at the moment? When I try, the machine seems to crash:

root@one:~ # sysctl sys.device.mlx4_core0.mlx4_port1=eth
sys.device.mlx4_core0.mlx4_port1: auto (ib)
Write failed: Broken pipe
Shared connection to xxx closed.

Unfortunately I don't have serial console access at the moment, so I can't access any messages that may have gotten dumped.

The cards in question are:

mlx4_core0@pci0:17:0:0: class=0x028000 card=0x005015b3 chip=0x100315b3 rev=0x00 hdr=0x00
    vendor = 'Mellanox Technologies'
    device = 'MT27500 Family [ConnectX-3]'
    class = network

Lars
Re: using ConnectX card as Ethernet (mlxen)
Hi,

if I leave the mlx4ib device out of the kernel (i.e., only compile in mlxen), doing the sysctl switch to Ethernet mode works fine.

Lars

On 2014-1-20, at 13:08, Eggert, Lars l...@netapp.com wrote:
> Hi,
>
> On 2013-7-9, at 22:08, John Nielsen li...@jnielsen.net wrote:
> > On Jul 9, 2013, at 9:58 AM, John Baldwin j...@freebsd.org wrote:
> > > So this was just fixed (finally) in HEAD in r253048. You can now use
> > > the sysctls to change this.
> >
> > I saw the commit. Thanks! I'll give it a try at some point (whenever my
> > schedule and hardware availability align).
>
> is this supposed to work at the moment? When I try, the machine seems to
> crash:
>
> root@one:~ # sysctl sys.device.mlx4_core0.mlx4_port1=eth
> sys.device.mlx4_core0.mlx4_port1: auto (ib)
> Write failed: Broken pipe
> Shared connection to xxx closed.
>
> Unfortunately I don't have serial console access at the moment, so I can't
> access any messages that may have gotten dumped.
>
> The cards in question are:
>
> mlx4_core0@pci0:17:0:0: class=0x028000 card=0x005015b3 chip=0x100315b3 rev=0x00 hdr=0x00
>     vendor = 'Mellanox Technologies'
>     device = 'MT27500 Family [ConnectX-3]'
>     class = network
>
> Lars
Re: using ConnectX card as Ethernet (mlxen)
On Monday 20 January 2014 12:08:34 Eggert, Lars wrote:
> Hi,
>
> On 2013-7-9, at 22:08, John Nielsen li...@jnielsen.net wrote:
> > On Jul 9, 2013, at 9:58 AM, John Baldwin j...@freebsd.org wrote:
> > > So this was just fixed (finally) in HEAD in r253048. You can now use
> > > the sysctls to change this.
> >
> > I saw the commit. Thanks! I'll give it a try at some point (whenever my
> > schedule and hardware availability align).
>
> is this supposed to work at the moment? When I try, the machine seems to
> crash:
>
> root@one:~ # sysctl sys.device.mlx4_core0.mlx4_port1=eth
> sys.device.mlx4_core0.mlx4_port1: auto (ib)
> Write failed: Broken pipe
> Shared connection to xxx closed.
>
> Unfortunately I don't have serial console access at the moment, so I can't
> access any messages that may have gotten dumped.
>
> The cards in question are:
>
> mlx4_core0@pci0:17:0:0: class=0x028000 card=0x005015b3 chip=0x100315b3 rev=0x00 hdr=0x00
>     vendor = 'Mellanox Technologies'
>     device = 'MT27500 Family [ConnectX-3]'
>     class = network

I believe this should work, yes. Getting a crashdump or the panic messages would be really helpful in figuring out why it isn't. Thanks.

--
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: using ConnectX card as Ethernet (mlxen)
On Monday, September 24, 2012 12:37:30 pm John Nielsen wrote:
> I have a machine running FreeBSD 10.0-CURRENT #0 r240887 amd64 with two
> ConnectX (InfiniBand) cards. Relevant bits of dmesg and pciconf -lv below.
>
> The cards are connected directly to a 10Gb Ethernet switch so I need to run
> them in eth mode rather than ib. Unfortunately they come up in ib mode and
> I don't know how to change it.
>
> The same hardware works fine under CentOS 6.3, though I need to manually
> set the cards to 'eth' there as well (which I do using a
> 'connectx_port_config' script from Mellanox that twiddles the mlx4_port1
> entries under /sys (sysfs)).
>
> Under FreeBSD I see these sysctls but I can't set them to 'eth' either via
> /boot/loader.conf or by sysctl after boot, with or without mlxen and/or
> mlx4ib loaded:
>
> sys.device.mlx4_core0.mlx4_port1: ib
> sys.device.mlx4_core1.mlx4_port1: ib

So this was just fixed (finally) in HEAD in r253048. You can now use the sysctls to change this.

--
John Baldwin
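[Editor's note] With the fix mentioned above, switching a port is a matter of writing "eth" to the per-port sysctl. For a machine with several cards, a small wrapper can build the OID names and apply the change in a loop. This is only a sketch: the function names and the SYSCTL override are invented for illustration, and the assumption that each card exposes both mlx4_port1 and mlx4_port2 comes from the later messages in this thread, not from this one.

```sh
#!/bin/sh
# port_oid CORE PORT
# Print the sysctl OID for one port, following the
# sys.device.mlx4_coreN.mlx4_portM naming quoted in this thread.
port_oid() {
    printf 'sys.device.mlx4_core%s.mlx4_port%s\n' "$1" "$2"
}

# set_port_eth CORE PORT
# Request Ethernet mode for one port. Setting SYSCTL=echo turns this into
# a dry run that only prints the assignment. (Hypothetical helper.)
set_port_eth() {
    ${SYSCTL:-sysctl} "$(port_oid "$1" "$2")=eth"
}

# Possible use for two dual-port cards (assumed layout):
#   for core in 0 1; do
#       for port in 1 2; do
#           set_port_eth "$core" "$port"
#       done
#   done
```

A dry run with `SYSCTL=echo set_port_eth 0 1` shows the exact assignment before anything is changed on a live system.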
Re: using ConnectX card as Ethernet (mlxen)
On Jul 9, 2013, at 9:58 AM, John Baldwin j...@freebsd.org wrote:
> On Monday, September 24, 2012 12:37:30 pm John Nielsen wrote:
> > I have a machine running FreeBSD 10.0-CURRENT #0 r240887 amd64 with two
> > ConnectX (InfiniBand) cards. Relevant bits of dmesg and pciconf -lv below.
> >
> > The cards are connected directly to a 10Gb Ethernet switch so I need to
> > run them in eth mode rather than ib. Unfortunately they come up in ib
> > mode and I don't know how to change it.
> >
> > The same hardware works fine under CentOS 6.3, though I need to manually
> > set the cards to 'eth' there as well (which I do using a
> > 'connectx_port_config' script from Mellanox that twiddles the mlx4_port1
> > entries under /sys (sysfs)).
> >
> > Under FreeBSD I see these sysctls but I can't set them to 'eth' either
> > via /boot/loader.conf or by sysctl after boot, with or without mlxen
> > and/or mlx4ib loaded:
> >
> > sys.device.mlx4_core0.mlx4_port1: ib
> > sys.device.mlx4_core1.mlx4_port1: ib
>
> So this was just fixed (finally) in HEAD in r253048. You can now use the
> sysctls to change this.

I saw the commit. Thanks! I'll give it a try at some point (whenever my schedule and hardware availability align).

JN